Abstract

Recently, short-time waveform analysis has been used to analyze, manipulate, and synthesize speech [e.g., Lienard, ICASSP-87]. Each waveform is described by six parameters: envelope attack and decay, reference instant, energy, internal frequency, and phase. To better understand these parameters, and to determine which of them carry pitch information in waveform-analyzed speech, perceptual experiments were performed using 300-ms synthetic stimuli formed by repeating 10-ms waveforms. The first experiments, with ten normal-hearing subjects, verified that the parameters affecting perceived pitch were internal frequency, offset (repetition interval), and the change in phase between successive waveforms; the remaining parameters primarily affected timbre. Next, the parameters found to affect pitch were varied individually and jointly to explore their interaction. These experiments showed that both frequency and offset could dominate perceived pitch. In the region explored, 250–500 Hz (ABX test), when either frequency or offset was 500 Hz, varying the other parameter did not change the response from 500 Hz. However, when either parameter was fixed at 250 Hz, varying the other changed the pitch percept. When the two parameters were varied together, the response changed from 250 to 500 Hz at the offset frequency closest to the perceptual midpoint between the two references, even though the internal frequency was higher. Finally, the parameters were varied in opposition to determine whether a particular parameter dominates.
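As a rough illustration of how such a stimulus might be assembled, the following sketch overlap-adds 10-ms waveforms at a fixed repetition interval to fill 300 ms. It is not the authors' synthesis procedure: the exponential attack/decay envelope, the placement of the reference instant at the end of the attack, the 16-kHz sample rate, and the names `elementary_waveform` and `synthesize_stimulus` are all assumptions made for this example.

```python
import math


def elementary_waveform(freq_hz, phase_rad, energy, attack_s, decay_s,
                        duration_s=0.010, sample_rate=16000):
    """Return one short-time waveform as a list of samples.

    Assumed model (not taken from the source): a sinusoid at the internal
    frequency under an exponential attack and an exponential decay, with
    the reference instant placed at the end of the attack.
    """
    n = int(round(duration_s * sample_rate))
    ref = attack_s  # assumed position of the reference instant
    raw = []
    for i in range(n):
        t = i / sample_rate
        if t < ref:
            env = math.exp(-(ref - t) / attack_s)   # rising part
        else:
            env = math.exp(-(t - ref) / decay_s)    # falling part
        raw.append(env * math.sin(2 * math.pi * freq_hz * (t - ref) + phase_rad))
    # Scale so that the sum of squared samples equals the requested energy.
    total = sum(x * x for x in raw)
    gain = math.sqrt(energy / total) if total > 0 else 0.0
    return [gain * x for x in raw]


def synthesize_stimulus(freq_hz, offset_hz, phase_step_rad=0.0,
                        total_s=0.300, sample_rate=16000,
                        attack_s=0.001, decay_s=0.003):
    """Overlap-add waveforms repeated every 1/offset_hz seconds.

    offset_hz expresses the repetition interval as a rate; phase_step_rad
    is the assumed phase change between successive waveforms.
    """
    n_total = int(round(total_s * sample_rate))
    out = [0.0] * n_total
    period = sample_rate / offset_hz  # repetition interval in samples
    k = 0
    while True:
        onset = int(round(k * period))
        if onset >= n_total:
            break
        wave = elementary_waveform(freq_hz, k * phase_step_rad, 1.0,
                                   attack_s, decay_s,
                                   sample_rate=sample_rate)
        for i, x in enumerate(wave):
            if onset + i < n_total:  # truncate at the stimulus end
                out[onset + i] += x
        k += 1
    return out


if __name__ == "__main__":
    # Internal frequency 500 Hz, repetition rate 250 Hz, no phase change.
    stimulus = synthesize_stimulus(freq_hz=500.0, offset_hz=250.0)
    print(len(stimulus), "samples")
```

Setting `freq_hz` and `offset_hz` independently mirrors the abstract's distinction between internal frequency and repetition interval, so conditions such as a 500-Hz internal frequency with a 250-Hz offset can be generated under these assumptions.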
