Abstract

An earlier experiment [Gerstman, Liberman, Delattre, and Cooper, J. Acoust. Soc. Am. 26, 952 (1954)] demonstrated that variations in the duration of a change in formant frequency can serve as an effective cue for distinguishing one class of speech sounds from another. The duration cue was found to operate independently of place cues; i.e., it served to distinguish the stop b from the semivowel w, both of which are articulated at a speaker's lips. Exploratory work now suggests that another temporal cue, noise duration, operates in much the same fashion, distinguishing among classes of sounds produced at the same place in the mouth. A tape recording of the spoken syllable ʃa (as in shop) was progressively shortened by cutting off increasing amounts of the noise portion with which the syllable begins. When the noise portion was approximately half its original length, the syllable was heard as tʃa (as in chop), and when most of the noise had been removed, the syllable was heard by some listeners as ka and by others as ta. (Inasmuch as ʃa and tʃa are both produced at a point in the mouth that lies between the places where k and t are formed, the listeners' uncertainty as to whether the stop is k or t is understandable.) Thus the consonant, originally a fricative, became an affricate and then a stop. The experiment has been repeated with more highly controlled synthetic speech patterns. Here it was found that a second cue of some importance was the time required by the friction to rise to full intensity. [This work was supported in part by the Carnegie Corporation of New York, and in part by the Department of Defense in connection with Contract DA49-170-sc-1642.]
