Abstract

The importance of speech command recognition in human-machine interaction systems has increased in recent years. In this study, we propose a deep neural network-based system for acoustic and throat command speech recognition. We apply a preprocessing pipeline to create the input of the deep learning model. First, speech commands are decomposed into components using well-known signal decomposition techniques. The Mel-frequency cepstral coefficients (MFCC) feature extraction method is then applied to each component of the speech commands to obtain the feature inputs for the recognition system. At this stage, we compare the performance of different speech decomposition techniques, namely wavelet packet decomposition (WPD), continuous wavelet transform (CWT), and empirical mode decomposition (EMD), in order to find the best technique for our model. We observe that WPD gives the best performance in terms of classification accuracy. This paper investigates a long short-term memory (LSTM)-based recurrent neural network (RNN), which is trained using the extracted MFCC features. The proposed neural network is trained and tested using acoustic speech commands. We also train and test the proposed model using throat microphone speech commands. Lastly, transfer learning is employed to increase the test accuracy for throat speech recognition: the weights of the model trained on the acoustic signals are used to initialize the model used for throat speech recognition. Overall, we obtain high classification accuracy for both acoustic and throat command speech. We find that the LSTM performs considerably better than the GMM-HMM model, convolutional neural networks such as CNN-tpool2, and residual networks such as res15 and res26, with an accuracy of over 97% on Google’s Speech Commands dataset, and we achieve 95.35% accuracy on our throat speech dataset using the transfer learning technique.
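The decomposition step of the pipeline can be illustrated with a minimal sketch of a full wavelet packet tree. The abstract does not state which wavelet or decomposition depth is used, so the Haar wavelet, the `level` parameter, and the function names `haar_step` and `wavelet_packet_decompose` are assumptions made only for illustration. In the described pipeline, each returned sub-band would then be passed to MFCC extraction, which is not shown here.

```python
import math


def haar_step(x):
    """Split an even-length sequence into Haar approximation and detail halves."""
    s = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail


def wavelet_packet_decompose(signal, level):
    """Return the 2**level leaf sub-bands of a full Haar wavelet packet tree.

    Unlike the plain discrete wavelet transform, both the approximation and
    the detail branch are split again at every level. Leaves are returned in
    natural (tree) order, not sorted by frequency.
    """
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


# Illustrative input: a short synthetic tone standing in for a speech frame.
signal = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
bands = wavelet_packet_decompose(signal, level=2)
```

Because the Haar filters are orthonormal, the total energy of the sub-bands equals the energy of the input, which is a convenient sanity check on any implementation of this step.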
