Abstract

Ambient assisted living technology uses smart devices to enhance the well-being of elderly or disabled persons in their daily lives. Assistive technology for people with neurological disorders is gaining importance, yet remains a challenging task. Developing an intelligent speech assistive system for such patients is a form of rehabilitation that improves their quality of life. One such neurological disorder is dysarthria, a motor speech impairment that causes articulatory deficits. People with dysarthria have difficulty pronouncing words, which results in phoneme-level variations in their speech. To capture these variations effectively from impaired speech utterances, we propose a multi-view representation-based disordered speech recognition system. Auditory images generated from the impaired speech utterances of different word classes are highly discriminative. Multi-view representations are formed by combining the auditory image-based features with cepstral features, and they provide better discrimination between acoustically similar but distinct word classes. The proposed approach is evaluated on the UA-SPEECH corpus with two datasets: 100 common words and 15 word classes of acoustically similar word pairs. The training and testing samples span varied intelligibility levels: high, medium, low, and very low. Compared with conventional hidden Markov models (HMM), deep neural network-HMM, constant-Q transform (CQT), and other single-view auditory image-based approaches, the proposed approach shows that even words of very low intelligibility are recognized correctly, improving the performance of disordered speech recognition.
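As a rough illustration of how a multi-view representation can be assembled, the sketch below frame-aligns two feature views and stacks them per utterance. This is a minimal sketch, not the authors' implementation: it uses librosa MFCCs for the cepstral view and a constant-Q magnitude spectrogram as a stand-in for the auditory-image view (the abstract does not detail that front end), and a synthetic tone in place of a dysarthric speech recording; the hop length is an assumed parameter.

import numpy as np
import librosa

sr = 16000
hop = 512

# Stand-in signal; in practice, load a dysarthric speech utterance instead.
y = librosa.tone(440.0, sr=sr, duration=1.0)

# View 1: cepstral features (MFCCs), shape (n_mfcc, n_frames).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=hop)

# View 2: constant-Q magnitude spectrum, shape (n_bins, n_frames),
# used here only as a placeholder for the paper's auditory image-based
# features, which are not specified in the abstract.
cqt = np.abs(librosa.cqt(y, sr=sr, hop_length=hop))

# Align frame counts (the two front ends can differ by a frame) and
# concatenate along the feature axis to form the multi-view representation.
n = min(mfcc.shape[1], cqt.shape[1])
multi_view = np.vstack([mfcc[:, :n], cqt[:, :n]])
print(multi_view.shape)  # (13 + n_bins, n) per utterance

The stacked matrix can then be fed to whatever acoustic model is used downstream (e.g., an HMM- or DNN-based recognizer), with both views contributing to each frame's feature vector.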
