Abstract

Formant measurement procedures often rely on there being a low fundamental frequency. An early study [B. Lindblom, International Congress of Phonetic Sciences, 4th, Helsingfors, 1961, 189–202 (1962)] found that the mean error in formant estimation ranged from about 40 Hz to a frequency of one‐fourth the fundamental. This study compares signal processing techniques for the estimation of formant frequencies and bandwidths of synthesized and natural speech characterized by a high fundamental frequency. Utterances were synthesized [D. H. Klatt, J. Acoust. Soc. Am. 67, 971–995 (1980)] using young children's utterances as models. The spectral and durational characteristics were matched closely by manipulating the synthesizer parameters. Spectrograms, discrete Fourier transforms, linear prediction envelopes, and auditory pseudospectrograms were computed for both the synthesized and natural utterances. The accuracy of formant estimation was judged by comparing the values determined by each of these methods to the known frequencies and bandwidths of the synthesized speech. Implications for formant estimation of natural speech will be discussed. [Work supported in part by a Whitaker Health Sciences Fellowship.]
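The comparison step described above, checking estimated formant frequencies and bandwidths against the known synthesizer values, can be illustrated with a minimal sketch. It is not the procedure used in the study. It assumes the standard second-order digital resonator relation, in which a formant of frequency F and bandwidth B at sampling rate fs corresponds to a pole pair of radius exp(-pi*B/fs) and angle 2*pi*F/fs. The function names, the sampling rate, and the formant values are hypothetical and chosen only for illustration.

```python
import numpy as np

def formants_to_poly(freqs_hz, bws_hz, fs):
    """Build all-pole denominator coefficients [1, a1, ..., ap] from
    formant frequencies and bandwidths (one conjugate pole pair each)."""
    a = np.array([1.0])
    for f, b in zip(freqs_hz, bws_hz):
        r = np.exp(-np.pi * b / fs)      # pole radius from bandwidth
        theta = 2.0 * np.pi * f / fs     # pole angle from frequency
        section = [1.0, -2.0 * r * np.cos(theta), r * r]
        a = np.convolve(a, section)      # cascade the resonators
    return a

def poly_to_formants(a, fs):
    """Convert the roots of a prediction polynomial to formant
    frequencies and bandwidths in Hz, sorted by frequency."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]    # keep one root per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    order = np.argsort(freqs)
    return freqs[order], bws[order]

def estimation_error(estimated, known):
    """Signed error of each estimate relative to the known value (Hz)."""
    return np.asarray(estimated, dtype=float) - np.asarray(known, dtype=float)

# Hypothetical "known" synthesizer settings, for illustration only.
fs = 10000.0
known_f = [700.0, 1800.0, 3200.0]
known_b = [80.0, 110.0, 150.0]

a = formants_to_poly(known_f, known_b, fs)
est_f, est_b = poly_to_formants(a, fs)
freq_err = estimation_error(est_f, known_f)
bw_err = estimation_error(est_b, known_b)
```

Here the polynomial is built directly from the known values, so the errors are zero up to numerical precision. In practice the coefficients would come from a linear prediction analysis of the waveform, and with a high fundamental frequency the recovered values would be expected to deviate from the synthesizer settings.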
