Abstract

The ability to accurately synthesize electrolarynx (EL) speech may provide a basis for better understanding the acoustic deficits that contribute to its poor quality. Such information could also lead to the development of acoustic enhancement methods that would improve EL speech quality. This effort was initiated with an analysis-by-synthesis approach that used the Klatt formant synthesizer to study vowels at the end of utterances spoken by the same subjects both before and after laryngectomy (normal versus EL speech). The temporal and spectral features of the original speech waveforms were analyzed and the results were used to guide synthesis and to identify parameters for modification. EL speech consistently displayed utterance-final fixed (mono) pitch and normal-like falling amplitude across different vowels and subjects. Subsequent experiments demonstrated that it was possible to closely match the acoustic characteristics and perceptual quality of both normal and EL speech with synthesized replicas. It was also shown that the perceived quality of the synthesized EL speech could be improved by modification of pitch parameters to more closely resemble normal speech. Some potential approaches for modifying the pitch of EL speech in real time will be discussed. [Funded by NIH Grant R01 DC006449.]
