Speech

Compression of human speech is a specialized form of audio compression. Greater levels of compression can be achieved by modeling the human vocal tract, and taking advantage of redundancy in human speech.
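As a rough illustration of what "modeling the vocal tract" can mean, the sketch below fits a linear predictor to a frame of samples, the idea behind the LPC-family codecs listed on this page. It is a minimal sketch, not any particular codec's implementation: the function names, the synthetic test frame, and the predictor order of 2 are all assumptions made for the example.

```python
import math

def autocorrelation(x, max_lag):
    """Return r[0..max_lag], the autocorrelation of one frame of samples."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Levinson-Durbin recursion.

    Returns (coeffs, err) where the prediction of sample n is
    sum(coeffs[k-1] * x[n-k] for k in 1..order) and err is the
    energy left in the prediction residual.
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient for this stage
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err

# Hypothetical test frame: a pure tone, which a 2-tap predictor can
# model almost perfectly (x[n] = 2*cos(w)*x[n-1] - x[n-2]).
OMEGA = 0.2 * math.pi
frame = [math.cos(OMEGA * n) for n in range(2000)]

r = autocorrelation(frame, 2)
coeffs, residual_energy = levinson_durbin(r, 2)
print("predictor coefficients:", coeffs)
print("residual / signal energy:", residual_energy / r[0])
```

A coder built on this idea sends the handful of predictor coefficients plus a compact description of the residual instead of the raw samples. Real speech coders typically use a higher order and add quantization, which this sketch leaves out.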

Jspeex

Published in Non-Commercial Programs, Source Code, Speech

This project is an attempt to port the free Speex voice codec to a pure Java implementation.

Version 0.9.4 is shipping as of June, 2004.

http://sourceforge.net/projects/jspeex/

admin

Posted in June 27th, 2004

Speex

Published in Non-Commercial Libraries, Source Code, Speech

The Speex project aims to build a patent-free, Open Source/Free Software voice codec. Unlike other codecs like MP3 and Ogg Vorbis, Speex is designed to compress voice at low bitrates in the 8-32 kbps/channel range. Possible applications include VoIP, internet audio streaming, archiving of speech data (e.g. voice mail), and audio books. In some sense, it is meant to be complementary to the Ogg Vorbis codec.

Speex 1.1.5 was released in April, 2004.

http://www.speex.org/

* * * *

admin

Posted in April 25th, 2004

The OpenH323 Project

Published in Non-Commercial Libraries, Speech, MPEG/ITU-T Video Standards, Audio

This open source project aims to create a free H.323 stack. The project was started as a reaction to the high cost of commercial implementations of audio and video compression code implementing the various components of H.323. Roger H. adds: “There are now several useful applications which use the library, including OpenMCU (a reliable multi-person conference server) and GnomeMeeting (a GTK/Gnome GUI client for Linux/BSD Unix).”

Version 1.13.13 of OpenH323 shipped in March, 2004.

http://www.openh323.org/

* * * * *

admin

Posted in March 14th, 2004

ComPacketer

Published in Commercial Libraries, Speech

The folks at Compandent have this to say about their product: Compandent’s ComPacketer is a voice coder which achieves a breakthrough in quality/bit rate/complexity tradeoff. Compandent’s novel technology, the ComPacketer that operates at 2.8 kb/s, produces speech with quality which exceeds that of ITU-T G.723.1 at 6.3 kb/s. Using the ComPacketer, only graceful degradation is introduced in a frame erasure environment, as compared to the higher quality degradation introduced by the common standards used for VoIP.

http://www.compandent.com/products_compacketer.htm

* * * * *

admin

Posted in March 7th, 2004

GAO Research Speech Codec Software

Published in Commercial Libraries, Speech

GAO Research sells speech codecs for quite a few different platforms, including a big batch of DSP parts. They support a wide variety of ITU formats, including G.729, G.711, G.722, and more.

http://www.gaoresearch.com/products/speechsoftware/speechsoftware.php

admin

Posted in December 21st, 2003

OTR Audio PDA Player Project

Published in Non-Commercial Programs, Speech

OTR stands for “Old Time Radio”. The owners of this project want to be able to listen to classic mono AM radio recordings on their Sharp PDAs.

http://otrplayer.sourceforge.net/

admin

Posted in July 23rd, 2003

Hawk Software Commercial Speech Codecs

Published in Commercial Libraries, Speech

Hawk Software is now selling a pair of codecs, LPC-10 and OpenLPC, that are suitable for Windows CE, CELinux, and other 32-bit embedded platforms. More are planned for future releases.

http://www.hawksoft.com/hawkvoice/commcodecs.shtml

admin

Posted in July 8th, 2003

Epigon Media Technologies

Published in Commercial Libraries, MP3/MPEG Audio, Speech, Audio

It’s a little hard for me to tell what Epigon is selling. They have a nice picture of some kind of board on their front page, but I don’t think they have any hardware for sale. They do appear to have audio codecs for MPEG-1, MPEG-2, and MPEG-4, as well as what appears to be a proprietary speech codec called eSpeech.

Note: Reader Ragu says that despite the problems with the web page, he is a big fan of their products. He is a satisfied customer who feels their audio codecs are the best.

http://www.epigonaudio.com/

* * * *

admin

Posted in July 2nd, 2003

Visual Text To Speech MP3

Published in Commercial Programs, MP3/MPEG Audio, Speech

This product takes your text, converts it to speech, then stores the result in one of several compressed formats, including MP3, Ogg Vorbis, and G.721.

http://www.visual-mp3.com/text-to-speech/

admin

Posted in June 7th, 2003

EasyAudio

Published in Commercial Libraries, Speech

The EasyAudio ActiveX Control adds speech handling capabilities to your Win32 program. Listed features include nice things such as support for popular codecs including G.729, G.711, and ADPCM, unicast and multicast support, AGC, jitter buffer management, and noise reduction. The web page gives a price of $1000 for the control, and $2000 for the source code. I hope that big price tag includes free distribution rights, but the web site is woefully short on license information.

Version 3.0 is shipping as of March, 2003.

http://www.lht2000.com/audio.html

admin

Posted in March 16th, 2003

AudioCodingWiki

Published in Links, Speech, Audio, Tutorials, Reference, Presentations

A nice set of links to AudioCoding information. Since this is a Wiki site, it is highly collaborative - registered users can provide updates and modifications to the site at will. (I think.)

http://www.audiocoding.com/wiki/

* * * * *

admin

Posted in December 13th, 2002

Nellymoser

Published in Companies/Organizations, Speech, Audio

Nellymoser is the leading provider of proprietary speech and audio software technology and solutions in the areas of compression, modification, synchronization and conversion. Our products improve speech and audio quality and efficiency in bandwidth-constrained environments while creating more immersive, interactive environments for your applications and services.

http://www.nellymoser.com/

* * * * *

admin

Posted in December 9th, 2002

Asao

Published in Commercial Libraries, Speech, Audio

The Asao libraries from Nellysoft have been designed specifically to address the need for a very small footprint, low bandwidth speech and audio compression. Asao will operate at a variety of bit rates (12/16/24/32 Kbps). This new technology can be rapidly harnessed for applications such as streaming over mobile data networks, Internet radio and embedded platforms such as toys and other consumer devices.

http://www.nellymoser.com/products/audio_compression_asao_fst.htm

admin

Posted in December 9th, 2002

Sase

Published in Commercial Libraries, Speech

The Sase libraries from Nellysoft offer flexible compression options for both embedded and data packet based compression implementations. It can operate in either a single bit rate or a multi-rate mode, offering compression rates (1.8/3.2/6.4 Kbps) to suit your application or the changing environment of packet based networks. Sase has the ability to switch bit rates on-the-fly to better handle changing network conditions. At 3.2 Kbps Sase offers near toll quality; as you would expect when more bits are added, the speech quality improves at 6.4 Kbps.

http://www.nellymoser.com/products/compression_fst.htm

* * * *

admin

Posted in December 9th, 2002

Voice Recording Applet SDK

Published in Commercial Libraries, Speech

The voice recording applet SDK is designed for web developers. It lets a web site record a visitor’s voice, compress it, and send it to the web server via HTTP. To play back the recorded voice from the server, you can use either the embedded voice streaming player or a separate voice streaming applet. Both applets are written in Java 1.1 and have a JavaScript interface.

http://www.vimas.com/ve_record_applet_sdk.htm

* * * * *

admin

Posted in September 15th, 2002

Intel Integrated Performance Primitives

Published in Commercial Libraries, Speech, Video, Image Compression, Audio

Intel has created a new library designed to deal with various primitives used in Data Compression. Intel says: Intel® Integrated Performance Primitives (IPP) is a software library which provides a range of library functions for multimedia, audio codecs, video codecs (for example H.263, MPEG-4), image processing (JPEG), signal processing, speech compression (i.e. G.723, GSM AMR*) plus computer vision as well as math support routines for such processing capabilities. Unlike their previous libraries, this is now a commercial product which is going to cost you as much as $199. Works with Windows and Linux.

http://developer.intel.com/software/products/ipp/ipp30/overview.htm

admin

Posted in August 18th, 2002

MELP coder

Published in Source Code, Speech

Source code for a 2.4 Kbps MELP coder. Target is Sun OS4. Phil Frisbie did the detective work needed to determine that the MELP coder is now owned by ASPI, so if you want to use it, you need to talk to them about licensing. See them at www.aspi.com.

http://www.data-compression.com/melp1.2.tar.gz

* * *

admin

Posted in March 5th, 2002

Signal Compression Lab

Published in Papers/Documentation, Speech, Video, Audio, Tutorials, Reference, Presentations

UCSB research activities, including speech coding, audio compression, video coding. Lots of links to demos and publications.

http://scl.ece.ucsb.edu/index.htm

admin

Posted in February 22nd, 2002

MELP at 2.4Kbps

Published in Source Code, Speech

Source code and documentation of some nice low bit rate speech coding.

http://maya.arcon.com/ddvpc/melp.htm

admin

Posted in February 22nd, 2002

Wikipedia entry: Linear Predictive Coding

Published in Speech, Tutorials, Reference, Presentations

The Wikipedia article on LPC. A very short definition.

http://en.wikipedia.org/wiki/Linear_predictive_coding

admin

Posted in January 27th, 2002

Wikipedia entry: Speech Coding

Published in Speech, Tutorials, Reference, Presentations

The Wikipedia article on speech coding. A few good definitions, and a few good links.

http://en.wikipedia.org/wiki/Speech_coding

admin

Posted in January 27th, 2002

Algorithm cuts VoIP bandwidth requirement

Published in Companies/Organizations, Speech

A company named Effnet Inc. is licensing a version of CRTP, a protocol that compresses packet headers in RTP streams. With small VoIP packets this can provide substantial savings.
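To see why header compression matters for small packets, here is a back-of-the-envelope calculation. It is only a sketch under stated assumptions: the payload is a G.729 stream at 8 Kbps sent as one 20-byte packet every 20 ms, the headers are plain IPv4/UDP/RTP (20 + 8 + 12 bytes), and CRTP shrinks them to 2 bytes, which is its best case. Link-layer framing is ignored, and none of these figures come from the Effnet article.

```python
# Uncompressed header sizes in bytes.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
FULL_HEADERS = IPV4_HEADER + UDP_HEADER + RTP_HEADER   # 40 bytes

# Assumed best-case size of a CRTP-compressed header.
CRTP_HEADER = 2

def stream_kbps(payload_bytes, header_bytes, packets_per_second):
    """Bandwidth of a packet stream in kilobits per second."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000

# Assumed codec settings: 20-byte payload, 50 packets per second.
PAYLOAD_BYTES = 20
PACKETS_PER_SECOND = 50

uncompressed = stream_kbps(PAYLOAD_BYTES, FULL_HEADERS, PACKETS_PER_SECOND)
compressed = stream_kbps(PAYLOAD_BYTES, CRTP_HEADER, PACKETS_PER_SECOND)
savings = 1 - compressed / uncompressed

print(f"without CRTP: {uncompressed:.1f} Kbps")
print(f"with CRTP:    {compressed:.1f} Kbps")
print(f"savings:      {savings:.0%}")
```

Under these assumptions the headers are twice the size of the speech payload, so compressing them cuts the stream from 24 Kbps to 8.8 Kbps.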

http://www.eet.com/story/OEG20020108S0054

admin

Posted in January 8th, 2002

Compressione della voce a 2.4 Kbit/s

Published in Italian / Italiano, Papers/Documentation, Speech

Part of a Master’s Thesis on voice compression, in Italian.

http://www.cs.brandeis.edu/~gim/Papers/Cefriel.pdf

admin

Posted in January 2nd, 2002

Open G.729(A) Initiative

Published in Non-Commercial Libraries, Speech

VoiceAge, of Montreal, announces the “Open G.729(A) Initiative,” which allows developers to freely use their G.729(A) codec object code for non-commercial purposes. This initiative provides you with an opportunity to work with the G.729(A) codec for free while developing products or applications. Take advantage of voice compression to prove that VoIP works efficiently and provides good voice quality.

Note: this site went all-Flash - which means you will have to navigate to the Open G.729 page manually.

http://www.voiceage.com/

admin

Posted in January 1st, 2002

MELP Vocoder Algorithm

Published in Papers/Documentation, Speech

Atlanta Signal Processor, Inc., is nice enough to host this paper on their site. It gives a brief overview of the MELP Vocoder algorithm.

http://www.aspi.com/tech/specs/pdfs/melp.pdf

* * * * *

admin

Posted in December 15th, 2001

Differences Between Microsoft and Apple ADPCM Files

Published in Papers/Documentation, Speech

Apple has published a tech note describing the differences between these two file formats, which on the face of it ought to be identical.

http://developer.apple.com/technotes/tn/tn1081.html

admin

Posted in September 14th, 2001

Spirit Corp

Published in Commercial Libraries, Speech

Spirit has a wide variety of speech codecs for sale, including standard G.711, G.729 and so on, all the way down to proprietary 1200 bps coders.

http://www.spiritcorp.com/vocoders.html

admin

Posted in July 25th, 2001

RFC 2422 - G.726 - 32 Kbps ADPCM codec

Published in Standards, Speech

This RFC defines the 32 Kbps toll quality MIME type. The true specification for G.726 is owned by the ITU, and will not be generally available on the net. So knowing how it is encoded is only somewhat useful.
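For reference, the 32 Kbps figure follows from simple arithmetic: telephone-band speech is sampled at 8000 Hz, and this ADPCM mode spends 4 bits on each sample, half of what G.711 uses. The short sketch below works that out; the helper name and the 20 ms frame length are assumptions chosen for illustration.

```python
SAMPLE_RATE_HZ = 8000   # telephone-band sampling rate

def bitrate_kbps(bits_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bit rate of a fixed-rate sample-by-sample codec, in Kbps."""
    return bits_per_sample * sample_rate_hz / 1000

def frame_bytes(bits_per_sample, frame_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Payload size in bytes for one frame of the given duration."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * bits_per_sample // 8

g711_kbps = bitrate_kbps(8)        # 8-bit companded PCM
g726_kbps = bitrate_kbps(4)        # 4-bit ADPCM
g726_frame = frame_bytes(4, 20)    # assumed 20 ms packetization

print(f"G.711: {g711_kbps:.0f} Kbps, G.726: {g726_kbps:.0f} Kbps")
print(f"G.726 bytes per 20 ms frame: {g726_frame}")
```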

http://www.faqs.org/rfcs/rfc2422.html

* * * * *

admin

Posted in July 15th, 2001

Speech Codecs

Published in Source Code, Speech

An ftp site with various speech codecs, including G.722, GSM, G.711, G.723, G.721, CELP, and LPC. Licensing and ownership of the C source varies.

ftp://ftp.cs.cmu.edu/project/fgdata/speech-compression/

* * * * *

admin

Posted in July 6th, 2001

Zarak Systems PSQM Testing

Published in Results, Companies/Organizations, Speech

Zarak Systems will perform Quality of Service testing using PSQM, the Perceptual Speech Quality Measurement. This web page will tell you a little bit about what that means to you.

http://www.psqm.com/

* *

admin

Posted in June 11th, 2001
