Abstract

A pitch-synchronous analysis was carried out over the vowel portions of the CVC utterances HAYED, HOD, and HODE and of the sentence FEW THIEVES ARE NEVER SENT TO THE JUG, recorded by a male speaker. For every pitch period, the analysis provided formant frequencies and the waveform of the vocal-cord excitation. This excitation waveform was then replaced by a simulated excitation waveform, with which the utterances were resynthesized. In Expt. I, six simulated waveforms with pulse shapes differing in the number and location of slope discontinuities were investigated. Listening tests indicated that simulated excitations whose pulse shapes have a single slope discontinuity, located at closure, are preferred. In Expt. II, simulated excitations with 16 combinations of opening and closing times of a preferred pulse shape were investigated. Listening tests indicated that very small opening or closing times, or opening times approximately equal to or less than closing times, are not preferred. In general, it was demonstrated that good-quality synthetic speech can be generated by using simple excitation waveforms specified uniformly over an utterance. The use of tournament testing strategies for perceptual evaluation of speech samples is also described.
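To make the idea of a simulated excitation concrete, the following is a minimal sketch of one pulse shape that matches the description of a single slope discontinuity at closure. The abstract does not give the formulas for any of the six waveforms, so the trigonometric form used here (a raised-cosine opening phase followed by a quarter-cosine closing phase) is an assumption, as are the function names `glottal_pulse` and `excitation`, the sample counts, and the fixed pitch period.

```python
import math


def glottal_pulse(n_period, n_open, n_close):
    """Return one pitch period of a simulated glottal pulse (hypothetical helper).

    n_period: samples in the pitch period
    n_open:   samples in the opening phase (flow rising from 0 to 1)
    n_close:  samples in the closing phase (flow falling from 1 to 0)
    The pulse shape is an assumed form, not taken from the abstract.
    """
    if n_open <= 0 or n_close <= 0 or n_open + n_close > n_period:
        raise ValueError("need 0 < n_open, 0 < n_close, n_open + n_close <= n_period")

    pulse = []
    for n in range(n_period):
        if n < n_open:
            # Opening phase: raised cosine, zero slope at both of its ends.
            value = 0.5 * (1.0 - math.cos(math.pi * n / n_open))
        elif n < n_open + n_close:
            # Closing phase: quarter cosine, zero slope at the peak but a
            # nonzero slope where it reaches zero, so the only slope
            # discontinuity of the pulse occurs at closure.
            value = math.cos(math.pi * (n - n_open) / (2.0 * n_close))
        else:
            # Closed phase: no flow for the rest of the period.
            value = 0.0
        pulse.append(value)
    return pulse


def excitation(n_periods, n_period, n_open, n_close):
    """Repeat the same pulse for every period (assumed constant pitch)."""
    return glottal_pulse(n_period, n_open, n_close) * n_periods


# Illustrative values only: 80 samples per period (100 Hz pitch at an
# assumed 8 kHz sampling rate), opening time longer than closing time.
wave = excitation(n_periods=3, n_period=80, n_open=32, n_close=16)
print(len(wave), max(wave))
```

In this sketch the opening time is chosen to be longer than the closing time, in line with the Expt. II finding that opening times equal to or less than closing times are not preferred. A resynthesis along the lines of the study would scale the pulse to each measured pitch period rather than use a constant period as this simplification does.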
