Perception of fricatives synthesized by higher-level control of a Klatt synthesizer

David R. Williams

doi:10.1121/1.411151

Abstract

Results of perceptual tests of fricatives synthesized using an acoustic-articulatory model [K. N. Stevens and C. A. Bickley, J. Phonet. 19, 161–174 (1991)] are presented. Parameters of the model permit time-varying control of vocal-tract shape (first four natural frequencies) and of glottal and oral cross-sectional areas. Air flow and intraoral pressure values are computed first, and KLSYN88 synthesizer source parameters are then estimated using mapping equations. The current study examines model predictions for intervocalic alveolar and labio-dental fricatives. Stimulus sets were constructed by varying the size of the peak glottal opening (8–20 mm²), the size of the minimum oral constriction (4–16 mm²), and the rate of oral constriction near closure/release (4 or 16 cm²/s). Subjects labeled the synthetic fricatives as voiced or voiceless and rated the "goodness" of the stimuli as exemplars of voiceless fricatives. As expected, stimuli with the smallest glottal openings were judged as voiced and those with the largest as voiceless. At intermediate glottal opening values, voicing judgments were influenced by the relative sizes of the glottal and oral openings and, to a lesser extent, by oral constriction rate. Goodness ratings of the stimuli generally correlated with labeling judgments. The results demonstrate the robustness of speech sound categories generated in this manner. [Work supported by NIMH.]
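The stimulus design described in the abstract can be illustrated with a small sketch. The abstract gives only the ranges of the two area parameters and the two constriction rates; the 4 mm² step size, the full crossing of all three parameters, and every name in the code below are assumptions made for illustration and are not taken from the study.

```python
from itertools import product

# Ranges come from the abstract; the step size is an ASSUMPTION (not stated in the source).
STEP_MM2 = 4
GLOTTAL_PEAK_MM2 = range(8, 20 + 1, STEP_MM2)   # peak glottal opening, 8-20 mm^2
ORAL_MIN_MM2 = range(4, 16 + 1, STEP_MM2)       # minimum oral constriction, 4-16 mm^2
RATE_CM2_PER_S = (4, 16)                        # oral constriction rates given in the abstract


def build_stimulus_grid():
    """Return one dict per combination of the three varied parameters.

    A fully crossed design is assumed here; the abstract does not say
    whether every combination was actually synthesized.
    """
    return [
        {"glottal_peak_mm2": g, "oral_min_mm2": o, "rate_cm2_per_s": r}
        for g, o, r in product(GLOTTAL_PEAK_MM2, ORAL_MIN_MM2, RATE_CM2_PER_S)
    ]


grid = build_stimulus_grid()
```

Under these assumed settings each area parameter takes four values, so the grid holds 4 × 4 × 2 combinations; a different step size or a partial crossing would change that count.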
