Abstract

The vocal tract normalization problem arises from the fact that the spectral envelopes of like vowels spoken by talkers with different vocal tract lengths appear as the same spectral pattern shifted along a log frequency scale. A narrow-band model of vowel perception was developed in which vowels are identified by comparing input narrow-band vowel spectra with broad-band, pitch-independent vowel templates [J. M. Hillenbrand and R. A. Houde, J. Acoust. Soc. Am. 113, 1044–1055 (2003)]. An earlier evaluation of this model was carried out on a large database of vowels spoken by men, women, and children. In that evaluation, the vocal tract normalization problem was bypassed by constructing separate templates for men, women, and children. The present study extended this work by examining two approaches to the vocal tract normalization problem: (1) an explicit normalization approach in which the spectral pattern of each input token was normalized by a factor based on the pitch of the token, and (2) an approach in which templates were assembled by summing together spectral patterns that were sufficiently similar, thus creating multiple templates for each vowel. Performance of the model with these two approaches to vocal tract normalization was compared with listening tests on a large database of vowels spoken by men, women, and children.
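The premise of the opening sentence can be illustrated with a minimal sketch. It assumes that a difference in vocal tract length scales every spectral peak by the same ratio, so the pattern moves by a constant offset on a log frequency axis. The function name `log_shift`, the peak frequencies, and the scale factor are hypothetical values chosen for illustration; none of them come from the study, and this is not the model's normalization procedure.

```python
import math

def log_shift(peaks_hz, scale):
    """Return the log2-frequency offset of each peak after uniform scaling."""
    return [math.log2(f * scale) - math.log2(f) for f in peaks_hz]

# Hypothetical spectral peak frequencies in Hz (illustration only).
reference = [700.0, 1200.0, 2500.0]

# Hypothetical scale factor standing in for a shorter vocal tract.
scale = 1.2

# Every peak moves by the same amount on the log axis: log2(scale).
shifts = log_shift(reference, scale)
print(shifts)
```

Because each offset equals `log2(scale)` regardless of the peak frequency, the scaled pattern is the reference pattern translated along the log frequency axis, which is the sense in which the abstract describes like vowels as shifted copies of one another.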
