Abstract

This study explored the feasibility of using a formant analysis-synthesis approach to replace the voicing source of esophageal speech. The replacement voicing sources were inverse-filtered signals extracted from normal speakers. Several pitch extraction methods were tested, and a computationally simple, band-limited autocorrelation method was chosen. To achieve stable and practical speech enhancement, the input signal was divided into low- and high-frequency channels, and only the low-frequency channel was processed by the formant analysis-synthesis method. A special-purpose DSP hardware unit was designed to perform the proposed analysis-synthesis process in real time. Subjective evaluation tests (rating scale method) were conducted with seven well-trained esophageal speakers and three speech therapists. The results showed that the synthesized speech was significantly improved, especially on the “loudness”, “sonority”, “strained”, “stoma noise”, “choppy”, “stability”, “intelligibility”, “recognizability”, and “duration” features.
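
As a rough illustration of the two signal-processing ideas named above (the low/high channel split and band-limited autocorrelation pitch extraction), the following is a minimal sketch and not the authors' implementation. The abstract gives no filter design, cutoff, sampling rate, or pitch search range, so the one-pole low-pass filter, the 900 Hz band limit, the 8 kHz sampling rate, and the 60 to 400 Hz search range are all assumptions chosen only for illustration, as are the function names. No voiced/unvoiced decision is attempted.

```python
import math


def one_pole_lowpass(x, fs, cutoff_hz):
    """First-order low-pass filter (an assumed stand-in for the band limiter)."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for sample in x:
        state += a * (sample - state)
        y.append(state)
    return y


def split_channels(x, fs, cutoff_hz):
    """Split x into a low channel and its complementary high channel.

    Only the low channel would go through analysis-synthesis; the high
    channel is the residual, so low + high reconstructs the input.
    """
    low = one_pole_lowpass(x, fs, cutoff_hz)
    high = [s - l for s, l in zip(x, low)]
    return low, high


def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0, band_hz=900.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    The frame is band-limited first, then the lag with the largest
    autocorrelation inside the [fmin, fmax] search range is selected.
    All default values are illustrative assumptions.
    """
    low = one_pole_lowpass(frame, fs, band_hz)
    mean = sum(low) / len(low)
    low = [v - mean for v in low]  # remove any DC offset

    min_lag = int(fs / fmax)
    max_lag = min(int(fs / fmin), len(low) - 1)

    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        r = sum(low[n] * low[n + lag] for n in range(len(low) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate
    # Synthetic test frame: 100 Hz fundamental, one harmonic, one high tone.
    frame = [
        math.sin(2 * math.pi * 100 * n / fs)
        + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
        + 0.3 * math.sin(2 * math.pi * 2500 * n / fs)
        for n in range(800)
    ]
    print(estimate_pitch(frame, fs))
```

On a clean synthetic frame like the one in the demo, the estimate should land close to the 100 Hz fundamental. Real esophageal speech is far noisier and less periodic, which is presumably why the original work compared several extraction methods before settling on one.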
