Monday, July 21, 2008

Vocoders redux


While flying back from vacation, I caught up on my New Yorker reading on the plane. The June 23rd issue had an article about voice recognition & synthesis that included mention of voders (the topic of my last post) as well as von Kempelen's contraption:
In the late eighteenth century, a Hungarian inventor named Wolfgang von Kempelen built a speaking machine by modelling the human vocal tract, using a bellows for lungs, a reed from a bagpipe for the vocal folds, and a keyboard to manipulate the "mouth." By playing the keys, an operator could form complete phrases in several different languages.
Reader po8 had a great comment/correction regarding my description of vocoders:
It looks like your description of how a vocoder works was taken from the Wikipedia article. Sadly, that article appears to be more-than-usually broken. (Why would one *ever* filter a given frequency with bandpass filters at other frequencies? This is just called "attenuation", and there's easier ways to get it.) Fortunately, their first reference link is to http://www.paia.com/ProdArticles/vocodwrk.htm, which provides a clear explanation of how a particular analog vocoder works, and confirms my recollections of the process.

The traditional analog vocoder is indeed a two stage process. In the encoding stage, human speech is sampled by a bank of fairly narrow bandpass filters at frequencies chosen to capture important speech features, and then the amplitudes of the filter bank outputs are measured.

The compression that was the target of the original algorithm comes from the fact that the amplitudes of the signals coming out of the filter bank carry most of the speech information, but are smooth in such a way that they can be compactly represented and transmitted infrequently.
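
The encoding stage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the circuit from the PAiA article: the 8 kHz sample rate, the four center frequencies, the 20 ms frame length, the filter Q, and the names `bandpass` and `encode` are all made up for the example, and the analog filters are stood in for by a standard second-order digital bandpass with RMS as the amplitude measurement.

```python
import math

def bandpass(signal, center_hz, sample_rate, q=4.0):
    """Second-order (biquad) bandpass filter with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the b1 coefficient is zero for this filter shape
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def encode(signal, centers_hz, sample_rate, frame_len):
    """Return one list of band amplitudes (RMS per band) for each frame."""
    bands = [bandpass(signal, f, sample_rate) for f in centers_hz]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frames.append([
            math.sqrt(sum(v * v for v in band[start:start + frame_len]) / frame_len)
            for band in bands
        ])
    return frames

# Assumed parameters, chosen only for illustration.
SAMPLE_RATE = 8000
CENTERS_HZ = [300, 600, 1200, 2400]
FRAME_LEN = 160  # 20 ms at 8 kHz

# One second of a 1200 Hz test tone standing in for speech.
tone = [math.sin(2.0 * math.pi * 1200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
frames = encode(tone, CENTERS_HZ, SAMPLE_RATE, FRAME_LEN)
print(len(tone), "samples ->", len(frames) * len(CENTERS_HZ), "amplitude values")
```

With these made-up numbers, one second of audio (8000 samples) is reduced to 50 frames of 4 amplitudes each, which is the kind of saving the paragraph above is describing.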

In the decoding stage of a traditional vocoder, the band amplitudes coming from the encoder are used to modulate sinusoids at the bank center frequencies, recovering enough of the original signal that the speech is understandable. This also gives the vocoder its characteristic "choir" sound.
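
A matching sketch of the decoding stage, again with assumptions labeled: the function name `decode`, the frame layout (one list of band amplitudes per frame), and the sample values are hypothetical, and each amplitude is simply treated as the peak level of its sinusoid and held constant for the whole frame.

```python
import math

def decode(frames, centers_hz, sample_rate, frame_len):
    """Rebuild audio by summing one amplitude-modulated sinusoid per band."""
    out = []
    for frame_index, amplitudes in enumerate(frames):
        for k in range(frame_len):
            # A global sample index keeps each sinusoid's phase continuous
            # across frame boundaries.
            n = frame_index * frame_len + k
            t = n / sample_rate
            out.append(sum(a * math.sin(2.0 * math.pi * f * t)
                           for a, f in zip(amplitudes, centers_hz)))
    return out

# Hypothetical encoder output: two frames with energy only in the 1200 Hz band.
frames = [[0.0, 0.0, 0.5, 0.0], [0.0, 0.0, 0.25, 0.0]]
audio = decode(frames, [300, 600, 1200, 2400], 8000, 160)
print(len(audio), "samples reconstructed")
```

Holding the amplitude constant within a frame makes the level jump at frame boundaries; a less crude version would probably interpolate between frames to avoid clicks.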

In electronic music, one typically uses a second filter bank to filter frequencies out of a musical source such as a guitar chord, and then modulates the output of the second bank with the amplitudes from the first bank. This is how the sort of "talking guitar" effects one often hears are produced.
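
The electronic-music variant can be sketched the same way. This is a rough illustration rather than how any particular product works: the names `bandpass`, `envelope`, and `vocode`, the band centers, the filter Q, and the 50 Hz envelope smoothing are all assumptions, and simple test tones stand in for the voice and the guitar chord.

```python
import math

def bandpass(signal, center_hz, sample_rate, q=4.0):
    """Second-order (biquad) bandpass filter with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the b1 coefficient is zero for this filter shape
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def envelope(signal, sample_rate, cutoff_hz=50.0):
    """Rectify, then smooth with a one-pole lowpass to track the amplitude."""
    coeff = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    level, out = 0.0, []
    for x in signal:
        level = coeff * level + (1.0 - coeff) * abs(x)
        out.append(level)
    return out

def vocode(modulator, carrier, centers_hz, sample_rate):
    """Shape each carrier band with the envelope of the same modulator band."""
    n = min(len(modulator), len(carrier))
    out = [0.0] * n
    for f in centers_hz:
        env = envelope(bandpass(modulator[:n], f, sample_rate), sample_rate)
        car = bandpass(carrier[:n], f, sample_rate)
        for i in range(n):
            out[i] += env[i] * car[i]
    return out

SAMPLE_RATE = 8000
CENTERS_HZ = [300, 600, 1200]

# Stand-in "voice": a 600 Hz tone for half a second, then silence.
voice = [math.sin(2.0 * math.pi * 600 * n / SAMPLE_RATE) if n < 4000 else 0.0
         for n in range(SAMPLE_RATE)]
# Stand-in "guitar chord": three sustained sinusoids.
chord = [sum(math.sin(2.0 * math.pi * f * n / SAMPLE_RATE) for f in CENTERS_HZ)
         for n in range(SAMPLE_RATE)]

talking = vocode(voice, chord, CENTERS_HZ, SAMPLE_RATE)

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

print("level while voice is active:", rms(talking[1000:4000]))
print("level after voice stops:", rms(talking[7000:]))
```

The chord plays continuously, but the output is only audible while the stand-in voice has energy, which is the effect the comment describes.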

While LPC, etc., are technically vocoding, they bear so little resemblance to the original technology that they are usually referred to by other names.
