Abstract
Emotions are explicit mental activities that find expression in speech, body gestures, facial features, and other channels. Speech is a fast, effective, and convenient mode of human communication; hence it has become the most researched modality in Automatic Emotion Recognition (AER). Extracting the most discriminative and robust features from speech for AER remains a challenge. This paper proposes a new algorithm, Shifted Linear Discriminant Analysis (S-LDA), to extract modified features from static low-level features such as Mel-Frequency Cepstral Coefficients (MFCC) and pitch. A 1-D Convolutional Neural Network (CNN) was then applied to these modified features to extract high-level features for AER. The classification performance of the proposed techniques was evaluated on three standard databases: the Berlin EMO-DB emotional speech database, the Surrey Audio-Visual Expressed Emotion (SAVEE) database, and the eNTERFACE database. The proposed technique outperforms state-of-the-art techniques: the best AER accuracy obtained is 86.41% on the eNTERFACE database, 99.59% on the Berlin database, and 99.57% on the SAVEE database.
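As a concrete illustration, the sketch below approximates the described pipeline in Python. The abstract does not define S-LDA precisely, so it is assumed here to mean fitting ordinary LDA on temporally shifted, stacked low-level frames; all function names, shift offsets, and the CNN layout are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: S-LDA is approximated (an assumption) by LDA
# fitted on temporally shifted, stacked copies of the low-level frames.
import numpy as np
import librosa
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from tensorflow.keras import layers, models

def low_level_features(path, n_mfcc=13):
    """Static low-level features named in the abstract: MFCC + pitch."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, T)
    f0 = librosa.yin(y, fmin=50, fmax=500, sr=sr)            # (T',)
    T = min(mfcc.shape[1], len(f0))
    return np.vstack([mfcc[:, :T], f0[None, :T]]).T          # (T, n_mfcc + 1)

def shifted_lda(frames, frame_labels, shifts=(-2, -1, 0, 1, 2)):
    """Assumed S-LDA reading: stack time-shifted copies of each frame
    (wrap-around at the edges kept for brevity), then project with LDA.
    frame_labels broadcasts an utterance's emotion label to its frames."""
    stacked = np.concatenate([np.roll(frames, s, axis=0) for s in shifts],
                             axis=1)
    lda = LinearDiscriminantAnalysis().fit(stacked, frame_labels)
    return lda.transform(stacked)

def build_cnn(seq_len, n_feats, n_classes):
    """Generic 1-D CNN over the S-LDA feature sequence (not the authors'
    exact architecture)."""
    return models.Sequential([
        layers.Input(shape=(seq_len, n_feats)),
        layers.Conv1D(64, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(128, 3, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(n_classes, activation="softmax"),
    ])
```

Note that LDA projects to at most n_classes − 1 dimensions, so for seven emotion classes the CNN would see six-dimensional feature sequences under this reading.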
Highlights
Emotion recognition involves analyzing an individual's facial expressions, non-verbal communication, or speech signals and classifying them as a particular emotion
The results show that the best accuracy obtained for Automatic Emotion Recognition (AER) is 86.41% for the eNTERFACE database (Martin et al., 2006), 99.59% for the Berlin database (Burkhardt et al., 2005), and 99.57% for the Surrey Audio-Visual Expressed Emotion (SAVEE) database (Jackson and Haq, 2014)
Using Mel-Frequency Cepstral Coefficients (MFCC) + Shifted Delta Coefficients (SDC) features with LDA feature selection, the best AER accuracy is 99.59% for the Berlin database, which is better than the 99.57% for the SAVEE database and the 86.41% for the eNTERFACE database (a sketch of the SDC computation is given below)
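Shifted Delta Coefficients stack k frame-level delta-cepstra taken at successive shifts of P frames, giving each frame a wider temporal context than plain deltas. The NumPy sketch below follows the usual N-d-P-k parameterization; the defaults d=1, P=3, k=7 are conventional illustrative values, not necessarily the paper's settings.

```python
import numpy as np

def shifted_delta_coefficients(c, d=1, P=3, k=7):
    """N-d-P-k SDC: for frame t, stack the k deltas
    c[t + i*P + d] - c[t + i*P - d], i = 0..k-1.

    c : (T, N) array of frame-level cepstra (e.g. MFCCs, frames as rows).
    Returns a (T, N*k) array.
    """
    T, N = c.shape
    # Edge-pad so every shifted delta is defined for all T frames.
    cp = np.pad(c, ((d, d + (k - 1) * P), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        s = i * P
        # Rows t = 0..T-1 of this block equal c[t+s+d] - c[t+s-d].
        blocks.append(cp[2 * d + s : 2 * d + s + T] - cp[s : s + T])
    return np.concatenate(blocks, axis=1)

# Example: 13 MFCCs per frame -> 13 * 7 = 91 SDC features per frame.
# sdc = shifted_delta_coefficients(mfcc_frames)  # mfcc_frames: (T, 13)
```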
Summary
Emotion recognition involves analyzing an individual's facial expressions, non-verbal communication, or speech signals and classifying them as a particular emotion. It has been stated that emotion recognition is critical for everyday living and fundamental when interacting with others (Chavhan et al., 2015). Medical fields such as psychiatry, which deal with understanding an individual's negative emotions, are recent applications of emotion recognition. The speech signal varies under different emotions or stressed conditions (Hansen and Bou-Ghazale, 1995; Hansen and Womack, 1996; Ramamohan and Dandapat, 2006). Mental stress, a common issue worldwide, can be seen in human speech attributes such as vocal jitter. Much research is ongoing on recognizing different emotions from the speech modality, and over the last few decades the human-machine interface has contributed significantly to medical assistance and psychiatry