Measuring the intelligibility of dysarthric speech through automatic speech recognition in a pluricentric language

Wei Xue, Catia Cucchiarini, Roeland Van Hout, Helmer Strik

doi:10.1016/j.specom.2023.02.004

Abstract

Speech intelligibility is an essential though complex construct for evaluating dysarthric speech. Various procedures can be used to measure speech intelligibility, most of which are based on subjective ratings assigned by experts. Since these procedures are subjective and laborious, automatic speech recognition (ASR) has been proposed to obtain objective metrics of intelligibility. Although promising results have been reported, ASR for dysarthric speech generally requires large amounts of data consisting of recorded and annotated speech. In the present study, we explored the possibility of using dysarthric speech resources from the dominant language variety to improve the performance of ASR systems on the dysarthric speech of the non-dominant variety of the same pluricentric language. Dutch is used as an example of a pluricentric language, with Netherlandic Dutch considered the dominant and Flemish Dutch the non-dominant variety. The performance of ASR is evaluated by using two types of intelligibility metrics: orthographic transcriptions and global intelligibility assessments, both obtained from experts. Overall, the results show that dysarthric speech data from the dominant language variety can contribute to improving automatic transcriptions and to developing objective, automatic global measures of speech intelligibility only when no data from the non-dominant variety are available for training ASR models.

Full Text