Abstract
Voice-driven communication assistive systems, namely speech enhancement (SE), voice conversion (VC), and automatic speech recognition with text-to-speech (ASR-TTS), are recognized approaches for improving the speech intelligibility of dysarthric speakers. However, it is unclear which approach performs best for patients with moderate dysarthria. This study compared three classic voice-driven assistive systems of different types under identical test conditions. Fourteen patients with mild-to-severe dysarthria and five speakers with normal speech recorded the training sets for these systems. Five patients with moderate dysarthria were then selected to record two additional testing sets, which were used to evaluate each system's benefit. Speech intelligibility and quality were assessed with Google Automatic Speech Recognition (Google ASR) and listening tests. The speech intelligibility scores obtained with Google ASR were 7.0%, 22.9%, and 93.8% for the SE, VC, and ASR-TTS systems, respectively. In the listening tests, speech intelligibility was 38.7%, 40.5%, and 95.5%, and speech quality scores were 1.81, 2.18, and 4.56, for the SE, VC, and ASR-TTS systems, respectively. The ASR-TTS system thus outperformed SE and VC on every measure. A t-distributed stochastic neighbor embedding (t-SNE) analysis was also used to compare the systems. Its results indicated that the phonetic posteriorgram features of ASR-TTS were more stable than the speech features used in the SE and VC systems (log-power spectrum and spectra). These results suggest that ASR-TTS is a promising system for improving the speech intelligibility and quality of patients with moderate dysarthria in future applications.