Formant motion

khol · Post by **khol** » Mon Dec 17, 2007 2:30 pm

Heya.
Can someone explain the concept "formant motion" for me?

rfoshaug · Post by **rfoshaug** » Mon Dec 17, 2007 3:21 pm

I can try...

In order to understand formant motion, you first need to understand the vocoder. The Radias vocoder (I believe the R3's is similar) is a 16-band vocoder and works like this:

#1. When you speak into a microphone and feed that sound into the synth, this sound (the "modulator") is analyzed and split into 16 bands, or frequency ranges (so one band measures the sound level from, for example, 50 to 100 hertz, another from 100 to 200, etc.). This is similar to a hi-fi spectrum analyzer, in which lights indicate the sound level in the bass, midrange, and treble. Some stereos have more than those 3 bands.

Of course the modulator can be another synth sound or a drum kit or anything, but speech is the most commonly used modulator sound source.

#2. Then as you play the keyboard with, say, a string sound (the "carrier"), this sound is routed through 16 bandpass filters. One of the filters only lets through, say, the 50-100 hertz portion of the string sound, another one only lets through 100-200 hertz etc. So each bandpass filter corresponds to the frequencies of the bands from point #1 above. And the analyzer in #1 is used to control the volume of each of these filters.

If you say "Baaa" into the microphone, there is more bass than treble, and so the lowest-frequency bandpass filters are heard more than the higher ones. If, on the other hand, you say "tsss" into the microphone, the higher-frequency bands of the analyzer detect more sound and increase the volume of the higher-frequency bandpass filters for the string sound. So the carrier (string sound) is shaped by these filters to respond to changes in the modulator (speech), and it sounds like the string sound is "singing" or "talking".
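To make steps #1 and #2 concrete, here is a rough sketch in Python of a 16-band channel vocoder. It is not how the Radias is actually implemented: the sample rate, the log-spaced band centers from 80 to 6000 Hz, the filter Q, and the envelope smoothing are all assumptions picked for illustration, and the filter is a generic second-order bandpass.

```python
import math

SAMPLE_RATE = 16000   # assumed sample rate for this sketch
NUM_BANDS = 16

def band_centers(low=80.0, high=6000.0, n=NUM_BANDS):
    """Log-spaced center frequencies (assumed, not the Radias's real bands)."""
    ratio = (high / low) ** (1.0 / (n - 1))
    return [low * ratio ** i for i in range(n)]

def bandpass(signal, center, q=4.0, fs=SAMPLE_RATE):
    """Generic second-order (biquad) bandpass filter with unity peak gain."""
    w0 = 2.0 * math.pi * center / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                      # b1 is zero for this filter
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def envelope(signal, smoothing=0.995):
    """Level detector: rectify the signal, then smooth it."""
    level = 0.0
    out = []
    for x in signal:
        level = smoothing * level + (1.0 - smoothing) * abs(x)
        out.append(level)
    return out

def vocode(modulator, carrier):
    """Step #1 measures the modulator per band; step #2 filters the carrier
    into the same bands and sets each band's volume from that measurement."""
    n = min(len(modulator), len(carrier))
    output = [0.0] * n
    for center in band_centers():
        levels = envelope(bandpass(modulator[:n], center))   # analyzer (#1)
        band = bandpass(carrier[:n], center)                 # carrier filter (#2)
        for i in range(n):
            output[i] += band[i] * levels[i]
    return output
```

Feeding a low-pitched modulator should leave mostly the low part of the carrier audible, and a hissy, high-pitched modulator mostly the high part, which is the "Baaa" versus "tsss" behaviour described above.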

As for formant motion, this is a way to store what the 16-band analyzer heard when, for example, you said a phrase into the microphone. That is, the sound levels detected by the 16 bands in #1 above are stored. So if you say the word "Radias" while recording formant motion, this sound ("Radias") is not stored as a sample or sound clip, but as 16 individual volume levels that change over the duration of the formant motion.

In the example of the word "Radias", the high-frequency bands of the formant motion would contain low levels in the first portion (as the "R" and "A" sounds don't contain much high-frequency content), but at the end of the formant motion recording, the high-frequency bands would go up when the "S" was pronounced.

This formant motion can then be used to control the vocoder's 16 bandpass filters (i.e. to set each filter's volume according to the recorded variations in each band over time), shaping a string sound or anything else you want so that it "speaks" the word "Radias" without you having to say it into the microphone.

However, the result may not be exactly the same as when you say the word into the microphone, because in a normal vocoder program you can also let some portion of your normal speech through and mix it with the carrier, and you can also let high treble through (sibilance level) so that sharp sounds like "t", "s", etc. come through. These options make it easier to hear and understand what the vocoder is saying, but they cannot be stored in the formant motion, as the formant motion only contains the movement of those 16 volume levels over a maximum of 7.5 seconds on the Radias.
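As a data structure, a formant motion can be pictured as a list of "frames", each holding 16 band levels. The sketch below is only an illustration of that idea, not the Radias's internal format: the frame rate of 100 level snapshots per second and both function names are made up, and the carrier is assumed to have been split into its 16 bands already.

```python
NUM_BANDS = 16
MAX_SECONDS = 7.5   # maximum formant motion length mentioned above
FRAME_RATE = 100    # assumed number of level snapshots per second (hypothetical)

def record_formant_motion(level_frames, frame_rate=FRAME_RATE):
    """Store only the 16 band levels per frame, never the audio itself.

    level_frames: iterable of 16-element sequences, one per analysis frame.
    Recording stops once the maximum length is reached.
    """
    max_frames = int(MAX_SECONDS * frame_rate)
    motion = []
    for frame in level_frames:
        if len(motion) >= max_frames:
            break
        if len(frame) != NUM_BANDS:
            raise ValueError("expected %d band levels per frame" % NUM_BANDS)
        motion.append(tuple(float(v) for v in frame))
    return motion

def apply_formant_motion(motion, carrier_band_frames):
    """Play back: scale each carrier band by its stored level, then sum.

    carrier_band_frames: one 16-element sequence per frame, holding the
    carrier's content in each band (assumed to be band-split already).
    """
    output = []
    for levels, bands in zip(motion, carrier_band_frames):
        output.append(sum(l * b for l, b in zip(levels, bands)))
    return output
```

For example, a motion whose top band only opens up in the last frames (like the "S" in "Radias") lets the carrier's high band through only at the end:

```python
frames = [[1.0] + [0.0] * 15] * 3 + [[0.0] * 15 + [1.0]] * 2
motion = record_formant_motion(frames)
hiss_only = [[0.0] * 15 + [2.0]] * 5   # carrier with content in the top band only
result = apply_formant_motion(motion, hiss_only)
```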

Hope this helps.

khol · Post by **khol** » Mon Dec 17, 2007 4:23 pm

Oh my god... that's what I call an explanation.
Thank you very much.

I will read this several times

Argus · Post by **Argus** » Tue Dec 18, 2007 9:49 am

I didn't ask the question, but wow, that was a great explanation. Thanks a lot, I learned something!

bambule · Post by **bambule** » Tue Dec 18, 2007 11:39 pm

Thanks !

futureline · Post by **futureline** » Wed Dec 26, 2007 4:15 am

Wow, that was a very, very nice explanation from a fellow Norwegian, thanks!
It actually put my understanding of this vocoder into a new perspective.

nemmo · Post by **nemmo** » Wed Dec 26, 2007 4:56 pm

I will explain it in a much simpler way.

Try to say "hello, my name is Hal 9000" with your mouth wide open.

Can't you ?

Well, the movement of your mouth is the modulator.

When you record "formant motion", you are basically recording this modulation. That allows you to exchange the "carrier" signal (the sound made by your larynx) for another carrier signal (the sound of the Radias).

Uses? Think "Harder, Better, Faster, Stronger".

botega · Post by **botega** » Wed Dec 26, 2007 9:06 pm

Try to say "hello, my name is Hal 9000" with your mouth wide open.

Can't you ?

now try this:

Three witches watch three swatch watches. Which witch watches which swatch watch? - but you have to say it at least three times and do it as fast as you can

Can't you?

slammah2012 · Post by **slammah2012** » Fri Dec 28, 2007 5:46 am

botega wrote:
Try to say "hello, my name is Hal 9000" with your mouth wide open.

Can't you ?
now try this:

Three witches watch three swatch watches. Which witch watches which swatch watch? - but you have to say it at least three times and do it as fast as you can

Can't you?

1)...Three witches watch three swatch watches. Which witch watches which swatch watch?......2)...Three witches watch three swatch watches. Which witch watches which swatch watch?......3)... I have Huge Testicles.....

Nope........I can't do it