Abstract
Formant measurement procedures often rely on there being a low fundamental frequency. An early study [B. Lindblom, International Congress of Phonetic Sciences, 4th, Helsingfors, 1961, 189–202 (1962)] found that the mean error in formant estimation ranged from about 40 Hz to a frequency of one‐fourth the fundamental. This study compares signal processing techniques for the estimation of formant frequencies and bandwidths of synthesized and natural speech characterized by a high fundamental frequency. Utterances were synthesized [D. H. Klatt, J. Acoust. Soc. Am. 67, 971–995 (1980)] using young children's utterances as models. The spectral and durational characteristics were matched closely by manipulating the synthesizer parameters. Spectrograms, discrete Fourier transforms, linear prediction envelopes, and auditory pseudospectrograms were computed for both the synthesized and natural utterances. The accuracy of formant estimation was judged by comparing the values determined by each of these methods to the known frequencies and bandwidths of the synthesized speech. Implications for formant estimation of natural speech will be discussed. [Work supported in part by a Whitaker Health Sciences Fellowship.]
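The dependence of formant error on the fundamental can be illustrated with a toy model. The sketch below is not the procedure used in this study or in Lindblom's. It assumes, purely for illustration, that a harmonic-based measurement reports the harmonic closest to the true formant, and that the true formant is equally likely to fall anywhere between harmonics. Under those assumptions the mean absolute error works out to one-fourth of the fundamental, the same form as the figure cited above. The function names, the 400–2000 Hz range, and the two fundamentals are hypothetical choices.

```python
def nearest_harmonic(formant_hz, f0_hz):
    """Return the harmonic of f0_hz closest to formant_hz (never below the first)."""
    k = max(1, round(formant_hz / f0_hz))
    return k * f0_hz


def mean_abs_error(f0_hz, lo_hz, hi_hz, steps=10000):
    """Average |nearest harmonic - formant| over formants spread evenly in [lo_hz, hi_hz].

    Assumption (illustrative only): the measurement simply snaps to the
    nearest harmonic, so the error is the distance to that harmonic.
    """
    total = 0.0
    for i in range(steps):
        # Sample the midpoint of each of `steps` equal-width cells.
        formant = lo_hz + (hi_hz - lo_hz) * (i + 0.5) / steps
        total += abs(nearest_harmonic(formant, f0_hz) - formant)
    return total / steps


if __name__ == "__main__":
    # Hypothetical fundamentals: an adult-like and a child-like voice.
    for f0 in (100.0, 400.0):
        err = mean_abs_error(f0, 400.0, 2000.0)
        print(f"F0 = {f0:.0f} Hz: mean error = {err:.1f} Hz, F0/4 = {f0 / 4:.1f} Hz")
```

In this model the error grows in proportion to the fundamental, which is why methods that work for low-pitched voices can degrade for children's speech. Real estimators such as linear prediction interpolate between harmonics, so their errors need not follow this simple rule.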