Abstract
According to the sliding-template model of vowel perception, vowel quality is specified by the frequencies of the first three formants, and within-category variability in formant patterns is attributed primarily to a single scaling parameter [Nearey, J. Acoust. Soc. Am. 85, 2088-2113, 1989]. In this view, speaker normalization in vowel perception centers on estimating the spectral-scaling parameter for a speaker. This process may draw on "indirect" evidence such as fundamental frequency (f0) or information about speaker size or sex. The model also suggests that vowel perception and speaker-size estimation may be integrated through shared use of the estimated scaling parameter, which is related to the speaker's vocal-tract length and height. An experiment is presented in which listeners heard vowel stimuli whose formant patterns were potentially ambiguous between the /æ/ of a larger speaker and the /ʌ/ of a smaller speaker. For each stimulus, listeners made a vowel-category judgment and estimated the height of the apparent speaker in feet and inches. Results are consistent with the predictions of the sliding-template model: apparent speaker size predicted perceived vowel quality independently of the acoustic characteristics of the sound, and f0 appears to affect vowel quality primarily indirectly.
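The idea of a single scaling parameter can be illustrated with a minimal sketch. It assumes that uniform scaling of a formant pattern corresponds to an additive shift on a log-frequency axis, so the least-squares estimate of that shift for a given vowel template is the mean log difference between the token and the template. The template values, the vowel labels, and the function names `fit_scaling` and `fit_all` are hypothetical placeholders chosen for illustration; they are not the stimuli, data, or estimation procedure of the study.

```python
import math

# Hypothetical reference formant patterns (F1, F2, F3 in Hz).
# Placeholder numbers for illustration only, not values from the study.
HYPOTHETICAL_TEMPLATES = {
    "ae": (700.0, 1800.0, 2600.0),
    "uh": (600.0, 1300.0, 2500.0),
}


def fit_scaling(token_hz, template_hz):
    """Fit one log-scale shift (psi) that slides the template onto the token.

    Returns (psi, rms_residual). psi is the mean of the per-formant log
    differences, which is the least-squares estimate of a single additive
    shift; rms_residual measures how far the token is from being a
    uniformly scaled copy of the template.
    """
    diffs = [math.log(f) - math.log(t) for f, t in zip(token_hz, template_hz)]
    psi = sum(diffs) / len(diffs)
    rms = math.sqrt(sum((d - psi) ** 2 for d in diffs) / len(diffs))
    return psi, rms


def fit_all(token_hz, templates=HYPOTHETICAL_TEMPLATES):
    """Fit every vowel template to the token; returns {vowel: (psi, rms)}."""
    return {vowel: fit_scaling(token_hz, t) for vowel, t in templates.items()}


# Usage: a token built by scaling the hypothetical "ae" template by 10%.
token = tuple(1.1 * f for f in HYPOTHETICAL_TEMPLATES["ae"])
fits = fit_all(token)
for vowel, (psi, rms) in fits.items():
    print(f"{vowel}: psi={psi:.3f}, rms residual={rms:.3f}")
```

Under these assumptions, each candidate vowel category implies its own scaling estimate for the same token, which is one way to see how a judgment about vowel quality and a judgment about speaker size could depend on a shared parameter.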