Abstract

A set of ten vowel area functions, based on MRI measurements, has been parameterized by an “empirical orthogonal mode decomposition” which accurately represents each area function as the sum of the mean area function and proportional amounts of a series of orthogonal basis functions. The mean area function was found to possess a formant structure similar to that of a uniform tube (i.e., nearly equally spaced formants), suggesting that the empirical orthogonal modes act as perturbations on the mean (∼neutral) vowel shape, much as past vocal tract analyses have considered perturbations on a uniform tube. The acoustic characteristics of the two most significant empirical orthogonal modes were examined, showing that the first formant tends to increase as either modal amplitude coefficient is increased from negative to positive values. However, the second formant was found to decrease in frequency for increasing values of the first modal coefficient and to increase for increasing values of the second modal coefficient. Next, a mapping between F1-F2 formant pairs and vocal tract area functions is proposed which is largely one-to-one but is initially limited to a constant vocal tract length. A possible method to include variable vocal tract length and higher-order orthogonal modes in the mapping is given. The mode-to-formant mapping suggests the possibility of an inverse mapping to determine physiologically realistic area functions from a speech waveform, and a simple example is presented. Finally, empirical orthogonal modes for a collection of ten vowels and eight consonants were derived and showed many similarities to those for the vowel-only case.
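The decomposition described above can be illustrated with a minimal sketch. It assumes the modes are obtained from a singular value decomposition of the mean-subtracted area functions, applied directly to the areas; the abstract does not state the exact procedure, so this is one plausible reading rather than the authors' method. The function names, the 44-section discretization, and the random placeholder data are all hypothetical and stand in for the MRI-based measurements.

```python
import numpy as np

def orthogonal_modes(areas):
    """Split area functions into a mean shape plus orthogonal modes.

    areas: array of shape (n_shapes, n_sections), one area function per row.
    Returns the mean area function, the modes (one per row), and the
    per-shape modal amplitude coefficients.
    """
    mean = areas.mean(axis=0)
    deviations = areas - mean
    # Assumption: SVD of the deviations; rows of vt are orthonormal
    # basis functions defined along the length of the vocal tract.
    u, s, vt = np.linalg.svd(deviations, full_matrices=False)
    coeffs = u * s  # amplitude of each mode for each shape
    return mean, vt, coeffs

def reconstruct(mean, modes, coeffs, n_modes):
    """Rebuild area functions from the mean and the first n_modes modes."""
    return mean + coeffs[:, :n_modes] @ modes[:n_modes]

# Placeholder data only: 10 shapes, 44 sections (hypothetical discretization).
rng = np.random.default_rng(0)
areas = rng.uniform(0.2, 6.0, size=(10, 44))

mean, modes, coeffs = orthogonal_modes(areas)
full = reconstruct(mean, modes, coeffs, n_modes=modes.shape[0])
two_mode = reconstruct(mean, modes, coeffs, n_modes=2)

print("max error, all modes:", np.abs(full - areas).max())
print("max error, two modes:", np.abs(two_mode - areas).max())
```

With all modes retained the reconstruction matches the input to numerical precision; truncating to the two leading modes gives the kind of low-dimensional description that an F1-F2 mapping would operate on. On random placeholder data the two-mode error is large, whereas real area functions are expected to be far more compressible.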

