Emotion recognition in speech using MFCC with SVM, DSVM and auto-encoder

Hadhami Aouani, Yassine Ben Ayed

doi:10.1109/atsip.2018.8364518

https://doi.org/10.1109/atsip.2018.8364518

Publication Date: Mar 1, 2018

Citations: 27

Affiliation: University of Sfax

Abstract

Emotion recognition from speech is one of the most important subdomains in the field of signal processing. In this work, our system follows a two-stage approach: feature extraction followed by a classification engine. First, two feature sets are investigated: 39 Mel-frequency cepstral coefficients (MFCC) and 65 MFCC features extracted based on the work of [20]. Second, we use the Support Vector Machine (SVM) as the main classification engine, since it is the most common technique in the field of speech recognition. In addition, we investigate the importance of recent advances in machine learning, including deep kernel learning and various types of auto-encoder (the basic auto-encoder and the stacked auto-encoder). A large set of experiments is conducted on the SAVEE audio database. The experimental results show that, using 39 MFCC features, the DSVM method outperforms the standard SVM, with classification rates of 69.84% and 68.25%, respectively. Additionally, the auto-encoder method outperforms the standard SVM, yielding a classification rate of 73.01%.
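
To make the feature-extraction stage more concrete, the following is a minimal sketch of how a 39-dimensional MFCC vector could be assembled. The abstract does not say how the 39 coefficients are composed; the sketch assumes the common convention of 13 static coefficients plus their first and second temporal differences, and it assumes frame-level features are averaged into one vector per utterance. The function names (`delta`, `stack_39`, `utterance_vector`) are hypothetical, and random numbers stand in for real MFCC frames.

```python
import numpy as np


def delta(feat: np.ndarray) -> np.ndarray:
    """Temporal difference along the frame axis (axis 0), same shape as input."""
    # np.gradient uses central differences inside and one-sided ones at the edges.
    return np.gradient(feat, axis=0)


def stack_39(static: np.ndarray) -> np.ndarray:
    """Stack static, delta and delta-delta coefficients: (T, 13) -> (T, 39).

    Assumption: 39 = 13 static + 13 delta + 13 delta-delta. The paper may
    compose its 39 coefficients differently.
    """
    d1 = delta(static)
    d2 = delta(d1)
    return np.hstack([static, d1, d2])


def utterance_vector(frames: np.ndarray) -> np.ndarray:
    """Average frame-level features over time: (T, D) -> (D,).

    Assumption: one fixed-length vector per utterance is fed to the classifier.
    """
    return frames.mean(axis=0)


# Placeholder input: random values standing in for 200 frames of 13 MFCCs.
rng = np.random.default_rng(0)
static_mfcc = rng.standard_normal((200, 13))
features = utterance_vector(stack_39(static_mfcc))
print(features.shape)
```

A vector of this kind, one per utterance, is the sort of fixed-length input that a classifier such as an SVM would then consume in the second stage.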

Full Text