Abstract

Neuromuscular disorders can lead to dysarthria, a speech disorder that mainly affects the motor speech system and often reduces speech intelligibility. Intelligibility is one of the parameters used to assess the severity of dysarthria: it decreases as severity increases. Automatic intelligibility assessment could be a reliable and cost-effective alternative to conventional methods, which require experienced speech pathologists. Such a system consists of feature extraction followed by classification. This work investigates features from the spectral and cepstral domains for intelligibility assessment. Frame-level features are extracted from different signal representations in both domains. Two encodings convert the frame-level features into utterance-level features: Fisher vector encoding and a temporal encoding based on descriptive statistics. The utterance-level features are fed to probabilistic linear discriminant analysis (PLDA) and artificial neural network (ANN) classifiers for intelligibility assessment. Overall, Fisher vector encoding performs better than temporal encoding. Across all features, the ANN assesses intelligibility levels better than PLDA, as expected. The short-time Fourier transform (STFT) and harmonics in the spectral domain, and constant-Q cepstral coefficients (CQCC) in the cepstral domain, performed well for intelligibility assessment on unseen data from the TORGO database.
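
As an illustration of the temporal encoding step, the sketch below collapses a variable-length sequence of frame-level feature vectors into one fixed-length utterance-level vector. The abstract does not say which descriptive statistics are used; the choice of mean, standard deviation, minimum, and maximum here is an assumption, and the function name `temporal_encode` and the toy data are hypothetical.

```python
from statistics import fmean, pstdev


def temporal_encode(frames):
    """Collapse a (num_frames x dim) sequence into one fixed-length vector.

    Assumed statistics (not specified in the abstract): per-dimension
    mean, population standard deviation, minimum, and maximum.
    """
    if not frames:
        raise ValueError("need at least one frame")
    # Transpose so that each tuple holds one feature dimension over time.
    columns = list(zip(*frames))
    encoded = []
    for stat in (fmean, pstdev, min, max):
        encoded.extend(stat(col) for col in columns)
    return encoded


# Toy utterance: 3 frames, 2 features per frame (hypothetical values).
frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
utterance_vector = temporal_encode(frames)
```

Under these assumptions the output length is four times the frame dimension, regardless of how many frames the utterance contains, which is what lets utterances of different durations be passed to a fixed-input classifier.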
