Formant Center Frequencies Research Articles

Synthetic speech has been used for decades to test theories regarding human speech perception. Typically, the speech has been constructed to sound as natural as possible given the constraints of the experiment. An alternative strategy has been to examine the perception of intentionally impoverished stimuli such as time‐varying sinusoid (TVS) replicas of speech [Remez et al., Science 212, 947‐950 (1981)]. The TVS signals consist of three tones whose frequencies mimic the formant center frequencies of a natural sentence. They exhibit few of the acoustic properties of natural speech. It has been demonstrated that TVS signals sound extremely unnatural although they are surprisingly intelligible. The goal of the present experiment was to determine some of the general acoustic characteristics of signals that are important for speech perception. This was accomplished by examining the perceptual consequences of adding simple temporal and spectral information to TVS sentences. The TVS signals were amplitude modulated at 100 Hz in order to give them more speechlike acoustic characteristics without giving them fundamental frequencies or harmonic structures. The modulation greatly improved the phonetic intelligibility of the acoustically sparse TVS signal. The modulated signal was also significantly more natural sounding to listeners than the unmodulated TVS signal. Performing this operation on natural speech, however, caused a decrement in intelligibility.
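The signal construction described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' procedure: the function name `tvs_replica`, the frame-wise constant frequency tracks, the equal tone amplitudes, the 16 kHz sample rate, and the full-depth raised-cosine modulator are all assumptions made here. The abstract itself specifies only three tones following the formant center frequencies and a 100 Hz modulation rate.

```python
import math

def tvs_replica(formant_tracks, frame_dur=0.01, sample_rate=16000, am_rate=None):
    """Sum sinusoids whose frequencies follow the given formant tracks.

    formant_tracks: one list of frequencies (Hz) per tone, one value per frame.
    am_rate: if set, apply a raised-cosine amplitude envelope at this rate (Hz).
             The envelope shape is an assumption, not taken from the abstract.
    """
    n_frames = len(formant_tracks[0])
    samples_per_frame = int(round(frame_dur * sample_rate))
    phases = [0.0] * len(formant_tracks)
    out = []
    for frame in range(n_frames):
        for _ in range(samples_per_frame):
            t = len(out) / sample_rate
            value = 0.0
            for k, track in enumerate(formant_tracks):
                # Accumulate phase so that frequency changes stay continuous.
                phases[k] += 2.0 * math.pi * track[frame] / sample_rate
                value += math.sin(phases[k])
            value /= len(formant_tracks)  # keep the sum within [-1, 1]
            if am_rate is not None:
                value *= 0.5 * (1.0 - math.cos(2.0 * math.pi * am_rate * t))
            out.append(value)
    return out

# Hypothetical steady-state tracks (Hz) for 10 frames; real tracks would be
# measured from the formants of a natural sentence.
tracks = [[500.0] * 10, [1500.0] * 10, [2500.0] * 10]
plain = tvs_replica(tracks)                  # unmodulated TVS replica
modulated = tvs_replica(tracks, am_rate=100) # amplitude modulated at 100 Hz
```

Multiplying by a 100 Hz envelope imposes periodic amplitude fluctuation on the tone complex. Note that the tones themselves are not harmonics of a common fundamental in this sketch.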

The perceptual role of the temporal fine-structure of vowel waveforms was investigated in five experiments. The interaction of the fundamental frequency and the first formant (F1) was shown to result in temporal patterns consisting of a number of cycles of F1 per fundamental frequency period. Changes in these patterns were shown to correlate with shifts in the perceptual boundary between /i/ and /I/. The results indicate that the perceptual system was responding to either a change in the number of cycles of F1 per fundamental period or a change in the harmonic structure of the sounds. The hypothesized temporal cue was then used in synthesizing vowel-like sounds which, while not differing in formant center-frequency or harmonic structure, did differ in temporal structure. Subjects were able to match different sequences of the vowel-like sounds with sequences of natural vowels as predicted by their temporal properties. In a subsequent experiment, two pure tones were used as building blocks for synthesizing the vowels a, e, i, o, and u. With careful temporal modeling, the two tones proved sufficient for synthesizing intelligible tokens of the five vowels. Although the possibility exists that all the results may be explained in terms of spectrum, the indications are that temporal properties play a considerable role in vowel perception. Subject Classification: [43]70.30; [43]65.75.
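The temporal pattern named in this abstract, the number of cycles of F1 per fundamental period, is simply the ratio of the two frequencies. The short sketch below makes that arithmetic explicit. The function name and the example frequencies are illustrative assumptions and are not values reported in the study.

```python
def f1_cycles_per_period(f1_hz, f0_hz):
    """Cycles of the first formant that fit into one fundamental period.

    One fundamental period lasts 1 / f0_hz seconds and one F1 cycle lasts
    1 / f1_hz seconds, so the count is f1_hz / f0_hz.
    """
    return f1_hz / f0_hz

# Hypothetical values: raising F1 at a fixed F0 changes the pattern ...
print(f1_cycles_per_period(300.0, 100.0))  # 3.0
print(f1_cycles_per_period(400.0, 100.0))  # 4.0
# ... and so does changing F0 at a fixed F1.
print(f1_cycles_per_period(400.0, 125.0))
```

When the ratio is an integer, F1 coincides with a harmonic of the fundamental, which is one reason the temporal and harmonic-structure explanations mentioned in the abstract are hard to separate.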

Articles published on Formant Center Frequencies

Naturalness and intelligibility of amplitude modulated time‐varying sinusoidal speech

Line spectral pairs—Formant correlation in speech

On the perception of intonation from sinusoidal signals: Tone height and contour

Temporal factors in vowel perception

Vowel Identification in Isolation and in Word Context

Fundamental Frequency and Absolute Vowel Identification
