Abstract

In Automatic Speech Recognition (ASR), impaired speech is a major challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthria. This approach fine-tunes the size and shift parameters of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system. The second aim is to determine whether there is a correlation between the speaker’s voice features and the optimal window and shift parameters that minimise the error of an ASR system for that specific speaker. For our experiments, we used both impaired and unimpaired Italian speech. Specifically, we used 30 speakers with dysarthria from the IDEA database and 10 professional speakers from the CLIPS database. Both databases are freely available. The results confirm that, if a standard ASR system performs poorly with a speaker with dysarthria, it can be improved by using the new speech analysis. Conversely, the new approach is ineffective for unimpaired and mildly impaired speech. Furthermore, there is a correlation between some of the speakers’ voice features and their optimal parameters.
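To make the role of the two tuned parameters concrete, the following is a minimal sketch, not the authors' implementation. It computes a magnitude short-time Fourier transform with an explicit window size and shift, then searches a small grid for the pair with the lowest score. The function names, the Hamming window, and `score_fn` are assumptions introduced for illustration; in the setting described above, the score would be the WER of a speaker-dependent ASR system trained on the resulting features.

```python
import numpy as np


def stft_magnitude(signal, sample_rate, window_ms, shift_ms):
    """Magnitude STFT with an explicit window size and shift (both in ms)."""
    win = int(round(sample_rate * window_ms / 1000))
    hop = int(round(sample_rate * shift_ms / 1000))
    if len(signal) < win:
        raise ValueError("signal is shorter than one analysis window")
    n_frames = 1 + (len(signal) - win) // hop
    window = np.hamming(win)  # assumed window shape, not taken from the source
    frames = np.stack(
        [signal[i * hop : i * hop + win] * window for i in range(n_frames)]
    )
    # One row per frame, one column per non-negative frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))


def grid_search(signal, sample_rate, window_sizes_ms, shifts_ms, score_fn):
    """Return the (window_ms, shift_ms) pair with the lowest score.

    score_fn is a hypothetical stand-in for "train and evaluate an ASR
    system on these features and return its WER".
    """
    best = None
    for window_ms in window_sizes_ms:
        for shift_ms in shifts_ms:
            if shift_ms > window_ms:
                continue  # skip pairs that would leave gaps between frames
            spec = stft_magnitude(signal, sample_rate, window_ms, shift_ms)
            score = score_fn(spec)
            if best is None or score < best[0]:
                best = (score, window_ms, shift_ms)
    if best is None:
        raise ValueError("no valid (window, shift) pair in the grid")
    return best[1], best[2]
```

With a 16 kHz signal, a 25 ms window and a 10 ms shift correspond to 400 and 160 samples respectively, so changing either value changes both the time resolution and the number of frames passed to the recogniser.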

Highlights

  • Dysarthria is a motor speech disorder caused by a neurological injury [1] affecting the brain areas responsible for speech

  • The results show that hybrid Deep Neural Network (DNN)-Hidden Markov Model (HMM) models outperform classical Gaussian Mixture Model (GMM)-HMM ones according to Word Error Rate (WER) measures

  • The first experiment results for CLIPS speakers show that for unimpaired speech, there exists an Optimal Region (OR) where the WER is minimised, and the OR shift size is very similar to the state-of-the-art value


Summary

Introduction

Dysarthria is a motor speech disorder caused by a neurological injury [1] affecting the brain areas responsible for speech. This damage manifests itself differently from one subject to another, leading to several types of dysarthria. Speech disorders can affect the production, rhythm, pitch, rate, loudness, quality, and duration of speech. People with dysarthria include individuals with a primary speech disorder and those who develop a speech disorder as a result of a disease such as amyotrophic lateral sclerosis (ALS) or Parkinson’s Disease (PD). Dysarthria reduces speech intelligibility, affecting social interaction and quality of life. Despite the fact that in the last several years Automatic Speech Recognition (ASR)

