Abstract
Speech emotion recognition is an emerging research field and is expected to benefit many application domains by providing an effective human–computer interface. Researchers are working extensively to decode human emotions from the speech signal in order to achieve effective interfaces and smart responses from computers. The accuracy of speech emotion recognition depends heavily on the type of features used and on the classifier employed for recognition. The contribution of this paper is to evaluate twelve different long short-term memory (LSTM) network models as classifiers based on Mel-frequency cepstral coefficient (MFCC) features. The paper presents a performance evaluation in terms of precision, recall, F-measure, and accuracy for four emotions (happy, neutral, sad, and angry) using an emotional speech database, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). The accuracy obtained is 89%, which is 9.5% higher than reported in recent literature. The best-suited LSTM model is further implemented on a Raspberry Pi board, creating a stand-alone speech emotion recognition system.

Keywords: Human–computer interaction, SER, MFCC, LSTM, Speech emotion recognition
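As a rough illustration of the pipeline the abstract describes (not the authors' code), the sketch below extracts MFCC features with librosa and feeds them to a small Keras LSTM classifier over the four emotion classes. The number of coefficients, the fixed sequence length, and the layer sizes are all assumptions for illustration; the paper itself compares twelve LSTM topologies.

```python
# Minimal sketch (assumed, not the authors' implementation):
# MFCC feature extraction + LSTM classifier for 4-class speech
# emotion recognition, using librosa and Keras.
import numpy as np
import librosa
from tensorflow.keras import layers, models

EMOTIONS = ["happy", "neutral", "sad", "angry"]  # classes from the paper
N_MFCC = 40        # assumed number of MFCC coefficients
MAX_FRAMES = 200   # assumed fixed sequence length (pad/truncate)

def extract_mfcc(path: str) -> np.ndarray:
    """Load an audio file and return a (MAX_FRAMES, N_MFCC) MFCC matrix."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=N_MFCC).T  # (frames, n_mfcc)
    # Pad or truncate to a fixed length so utterances batch cleanly.
    if mfcc.shape[0] < MAX_FRAMES:
        mfcc = np.pad(mfcc, ((0, MAX_FRAMES - mfcc.shape[0]), (0, 0)))
    return mfcc[:MAX_FRAMES]

def build_model() -> models.Model:
    """One possible LSTM topology; hyperparameters here are illustrative."""
    model = models.Sequential([
        layers.Input(shape=(MAX_FRAMES, N_MFCC)),
        layers.LSTM(128, return_sequences=True),
        layers.LSTM(64),
        layers.Dense(len(EMOTIONS), activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

A model of this form, once trained, can be exported and run for inference on a Raspberry Pi (e.g. via TensorFlow Lite), which is in the spirit of the stand-alone deployment the abstract reports.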