Abstract
Assessing the intelligibility of dysarthric speech is essential for planning therapy. Time-frequency representations have been used for automatic intelligibility assessment of dysarthria, with the representations derived from the utterance as a whole. Because voiced and unvoiced components have different characteristics, in this work we use different time-frequency representations for voiced and unvoiced segments and use each for intelligibility assessment with a CNN classifier. Finally, we combine the scores obtained by the two systems to assign an intelligibility level. The combined system is found to outperform the systems that use the voiced and unvoiced components separately, as well as the system that uses the utterance as a whole. Utterances from the TORGO database are used in the intelligibility assessment. Automatic assessment of speech intelligibility reduces the time and effort that speech-language pathologists spend on diagnosis and treatment design.
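The score combination step can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted sum below is an assumption, as are the function name `fuse_scores`, the `voiced_weight` parameter, the three intelligibility levels, and the example scores, all of which are hypothetical.

```python
def fuse_scores(voiced_scores, unvoiced_scores, voiced_weight=0.5):
    """Combine per-level scores from two classifiers by a weighted sum.

    Assumption: each input holds one score per intelligibility level,
    in the same order. Returns the fused scores and the index of the
    level with the highest fused score.
    """
    if len(voiced_scores) != len(unvoiced_scores):
        raise ValueError("both systems must score the same set of levels")
    if not 0.0 <= voiced_weight <= 1.0:
        raise ValueError("voiced_weight must lie in [0, 1]")

    unvoiced_weight = 1.0 - voiced_weight
    fused = [
        voiced_weight * v + unvoiced_weight * u
        for v, u in zip(voiced_scores, unvoiced_scores)
    ]
    # Pick the level whose fused score is largest.
    level = max(range(len(fused)), key=fused.__getitem__)
    return fused, level


# Hypothetical scores for three intelligibility levels (illustrative values only).
voiced = [0.2, 0.5, 0.3]      # output of the voiced-segment system
unvoiced = [0.1, 0.3, 0.6]    # output of the unvoiced-segment system
fused, level = fuse_scores(voiced, unvoiced)
print(level)
```

With equal weights, the two systems contribute equally; in practice a weight like this would be tuned on held-out data rather than fixed in advance.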