Abstract

To investigate the effects of fundamental frequency (F0) and formant frequency shifts on vowel identification, a high-quality vocoder (“STRAIGHT”) was used to process the syllables “bit” and “bet” spoken by an adult female talker. From these two endpoints a nine-step continuum was generated by interpolation of the time-varying spectral envelope. Upward and downward frequency shifts in spectral envelope (scale factors of 0.75, 1.0, or 1.33) were combined with shifts in F0 (scale factors of 0.5, 1.0, or 1.25). Downward frequency shifts generally resulted in malelike voices, whereas upward shifts were perceived as childlike. Matched frequency shifts, in which F0 and spectral envelope (i.e., formant frequencies) were shifted in the same direction, had relatively little effect on phoneme boundaries. Mismatched frequency shifts, in which F0 was modified independently of spectral envelope or vice versa, resulted in systematic boundary shifts. The changes in the identification functions were qualitatively consistent with predictions of a model trained using acoustic measurements derived from a database of naturally spoken vowel tokens from men, women, and children. The empirical and modeling results are consistent with the idea that vowel boundary shifts are a consequence of listeners' sensitivity to the statistical structure of natural speech.

