Abstract

The vocal tract is one of the most important systems in speech production; it begins at the glottis and ends at the lips. The vocal tract shape (VTS) is defined as the cross-sectional area as it varies from the glottis to the lips. A review of the literature shows that most research on vocal tract shape estimation (VTSE) relies on Wakita's algorithm, which is based on the autocorrelation of speech. The objective of this work is to investigate VTSE based on formant frequencies and on the autocorrelation, covariance, and lattice methods. For validation, vocal tract shape data for vowels obtained by magnetic resonance imaging (MRI) were used. The vowels /a/, /i/, /u/, /o/, the vowel-semivowel-vowel utterances /aya/ and /awa/, and some VCV syllables (/apa/, /uba/) were analyzed for three female and three male speakers. The formant frequency, autocorrelation, covariance, and lattice methods all gave satisfactory results for vowels and semivowels. However, when compared with the MRI shapes, the vowel VTS obtained with the formant frequency technique were more realistic. An investigation of the effect of analysis frame length on VTSE showed that the lattice method required a shorter analysis frame than the autocorrelation and covariance methods, and that its estimated areas were more consistent across analysis frames than those of the other methods.
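
As a rough illustration of the autocorrelation route mentioned above, the following minimal Python sketch computes frame autocorrelation, derives reflection coefficients with the Levinson-Durbin recursion, and converts them to relative section areas of a lossless-tube model. It is not the implementation used in this work. The function names, the unit lip area, and the sign convention relating reflection coefficients to area ratios are assumptions made for illustration, and pre-emphasis and windowing are omitted.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(r):
    """Levinson-Durbin recursion: autocorrelation -> reflection coefficients.

    Uses the prediction-error filter A(z) = 1 + a1 z^-1 + ... + ap z^-p,
    so each reflection coefficient k_i equals the last filter tap at order i.
    """
    order = len(r) - 1
    a = [1.0] + [0.0] * order
    err = r[0]                      # prediction error energy at order 0
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)        # error energy shrinks at each order
    return ks


def area_function(ks, lip_area=1.0):
    """Relative section areas, starting at the lips (assumed normalization).

    Assumed convention: next_area = area * (1 - k) / (1 + k). Other texts
    use the opposite sign for k, which inverts each ratio.
    """
    areas = [lip_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas


if __name__ == "__main__":
    # Hypothetical short frame; real use would take a windowed speech segment.
    frame = [0.0, 0.5, 0.9, 0.7, 0.1, -0.4, -0.8, -0.6, -0.1, 0.3]
    r = autocorrelation(frame, max_lag=4)
    print(area_function(reflection_coefficients(r)))
```

Only area ratios are recovered this way, so the absolute scale depends on the chosen reference area, and the number of tube sections equals the prediction order.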
