Denoising Convolutional Autoencoder Based Approach for Disordered Speech Recognition

S Chandrakala, Veni S Vishnika

doi:10.1142/s0218213023500586

Abstract

Efficient assistive speech technology is essential for improving the standard of living of persons with cognitive disorders. Various kinds of cognitive disorders affect speech articulation. Disordered Speech Recognition (DSR) can be used for rehabilitation and has gained importance as the population of disordered speakers has kept increasing in recent years. A speech utterance is commonly represented as a spectrogram. Since the spectrograms of disordered speech are noisy and incomplete, they need to be enhanced. We propose an approach that uses a Denoising Convolutional Autoencoder (DCAE) to enhance the spectrograms of disordered speech utterances, which are then used to train a CNN to recognize the disordered words. The proposed approach is evaluated on the 20-word dataset (acoustically similar word classes) and the 50-word dataset of the Impaired speech corpus in Tamil, and on the 100 common words dataset of the UA-Speech database. It performs significantly better than HMM, DNN-HMM, LFMMI, and CNN without enhancement. Spectrogram enhancement using DCAE helps to obtain better discrimination among overlapping disordered word classes and achieves the highest Word Recognition Accuracy among the compared methods.

Full Text