Abstract

Assistive tools that recognize speech impaired by neurological disorders are emerging, and building them is a complex task. An intelligent impaired speech recognition system helps persons with speech impairment improve their interactions with the outside world. Impaired speakers have difficulty pronouncing words, which results in partial or incomplete speech content. Existing automatic speech recognition systems are not effective for impaired speech recognition because of speaker-specific variations that depend on the severity of the neurological disorder. In this work, we investigate two important approaches, the Deep Neural Network-Hidden Markov Model (DNN-HMM) and the Lattice-Free Maximum Mutual Information (LF-MMI) approach, for effective recognition of impaired speech. The training and testing samples are collected from persons with different neurological disorders at varied intelligibility levels: high, medium, low, and very low. Recognition accuracy is evaluated and compared on two datasets, namely the 20 acoustically similar words and the 50 words Impaired Speech Corpus in Tamil.
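To make the hybrid DNN-HMM idea concrete, the sketch below shows how a neural network can map acoustic feature frames to posterior probabilities over HMM states, which are then rescaled into the likelihoods an HMM decoder consumes. It is a minimal illustration with assumed dimensions and randomly initialised weights, not the system described in this work.

```python
import numpy as np

# Minimal sketch of the hybrid DNN-HMM idea (not the authors' implementation):
# the DNN maps each acoustic feature frame to posteriors over HMM states, and
# dividing by the state priors gives the scaled likelihoods used by the HMM
# decoder. Dimensions (40-dim features, 2 hidden layers, 500 states) are assumptions.

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

feat_dim, hidden_dim, num_states = 40, 256, 500

# Randomly initialised weights stand in for a trained acoustic model.
W = [rng.standard_normal((feat_dim, hidden_dim)) * 0.01,
     rng.standard_normal((hidden_dim, hidden_dim)) * 0.01,
     rng.standard_normal((hidden_dim, num_states)) * 0.01]
b = [np.zeros(hidden_dim), np.zeros(hidden_dim), np.zeros(num_states)]

def dnn_state_posteriors(frames):
    """frames: (T, feat_dim) acoustic features -> (T, num_states) posteriors."""
    h = frames
    for Wi, bi in zip(W[:-1], b[:-1]):
        h = relu(h @ Wi + bi)
    return softmax(h @ W[-1] + b[-1])

# One utterance of 100 frames; in practice the state priors are estimated
# from state-level alignments of the training data.
frames = rng.standard_normal((100, feat_dim))
posteriors = dnn_state_posteriors(frames)
state_priors = np.full(num_states, 1.0 / num_states)

# Scaled log-likelihoods passed on to the HMM decoder.
scaled_loglik = np.log(posteriors + 1e-10) - np.log(state_priors)
print(scaled_loglik.shape)  # (100, 500)
```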

Highlights

  • Developing an assistive system for speech impairment due to neurological disorders is a complex pattern recognition task

  • We focus on investigating the Deep Neural Network-Hidden Markov Model (DNN-HMM) approach and a Lattice-Free Maximum Mutual Information (LF-MMI) approach for impaired speech recognition

  • For the 50 words Impaired Speech Corpus in Tamil dataset, the LF-MMI approach shows improvements of 25.5%, 11.18% and 35.15% over the HMM, Deep Neural Network (DNN)-HMM and Convolutional Neural Network (CNN) approaches, respectively


Summary

INTRODUCTION

Developing an assistive system for speech impairment due to neurological disorders is a complex pattern recognition task. We explore the Lattice-Free Maximum Mutual Information (LF-MMI) approach for impaired speech recognition, where better discrimination among incomplete utterances of different word classes is needed. Reported accuracies on the 50 words Impaired Speech Corpus in Tamil are 36.73%, 51.87%, 50.27%, 51.40% and 51.30%. Maximum Likelihood Linear Transform (MLLT) is applied on the features, with the same number of Gaussians maintained, to form the triphone tri2b model. The LF-MMI approach shows improvements of 3.38%, 2.33% and 22.93% over HMM, DNN-HMM and CNN, respectively, on the 20 acoustically similar words impaired speech corpus in Tamil. The performance of this approach is shown in Table 4.
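For intuition on the discriminative objective that LF-MMI optimises, the following sketch computes the sequence-level MMI criterion from per-utterance numerator (transcript-constrained) and denominator log-likelihoods. The values are toy numbers for illustration only; the lattice-free phone-level denominator graph used in practice is not modelled here.

```python
import numpy as np

# Conceptual sketch of the MMI criterion behind LF-MMI, not the lattice-free
# implementation itself. Per-utterance log-likelihoods below are assumed toy
# values, purely for illustration.

def mmi_objective(num_loglik, den_loglik):
    """Sum over utterances of log p(X | numerator) - log p(X | denominator).

    num_loglik: log-likelihood of the acoustics given the transcript graph.
    den_loglik: log-likelihood given the denominator graph of competing word
                sequences; maximising the difference sharpens discrimination
                between word classes.
    """
    num_loglik = np.asarray(num_loglik, dtype=float)
    den_loglik = np.asarray(den_loglik, dtype=float)
    return float(np.sum(num_loglik - den_loglik))

# Toy example with three utterances.
num = [-120.4, -98.7, -143.2]   # transcript-constrained (numerator) graph
den = [-118.9, -97.5, -140.1]   # denominator graph, which includes the numerator paths
print(mmi_objective(num, den))  # negative here; training pushes this value upward
```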

