Abstract

This paper proposes a robust model for domain recognition of acoustic communication using a Bidirectional LSTM and a deep neural network. The proposed model consists of five layers, namely: speech recognition, word embedding, a Bidirectional LSTM (BiLSTM) layer, and two fully connected (FC) layers. First, speech is recognized and the resulting text is preprocessed before being passed to the proposed model to obtain the domain of communication. The word embedding layer takes the padded sentence as the input sequence and outputs the encoded sentence. The BiLSTM layer captures temporal features, while the fully connected layers capture linear and nonlinear combinations of those features. We compared the performance of the proposed model with conventional machine learning algorithms such as SVM, KNN, Random Forest, and Gradient Boosting, and found that the proposed model outperforms them with a high accuracy of 90.09%.
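For concreteness, the sketch below shows one way the described pipeline (word embedding, a BiLSTM layer, and two FC layers) could be assembled in Keras. The vocabulary size, embedding dimension, hidden units, sequence length, and number of domain classes are not stated in the abstract and are assumed here for illustration only.

```python
# Minimal sketch of the described embedding -> BiLSTM -> 2x FC architecture.
# All hyperparameters below are assumptions, not values from the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 10000   # assumed vocabulary size
EMBED_DIM = 128      # assumed embedding dimension
NUM_DOMAINS = 5      # assumed number of communication domains

model = models.Sequential([
    # Word embedding: padded token sequence -> encoded sentence
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),
    # BiLSTM layer to capture temporal features of the sentence
    layers.Bidirectional(layers.LSTM(64)),
    # Two fully connected layers to combine those features (non)linearly
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_DOMAINS, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

In practice, the recognized and preprocessed text would be tokenized and padded to a fixed length before being fed to this model, mirroring the speech recognition and preprocessing steps described above.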
