Abstract

Automatic recognition of human emotions is a relatively new field that is attracting significant attention in research and development because of the major contribution it could make to real applications. Several previous studies reported speech emotion recognition using acted emotional corpora. For real-world applications, however, spontaneous corpora should be used to recognize human emotions from speech. This study focuses on speech emotion recognition using the FAU Aibo spontaneous children's corpus. A method based on the integration of feed-forward deep neural networks (DNN) and the i-vector paradigm is proposed, and another method based on deep convolutional neural networks (DCNN) for feature extraction with extremely randomized trees as the classifier is presented. For the classification of five emotions using balanced data, the proposed methods achieved unweighted average recalls (UAR) of 61.1% and 59.2%, respectively. These results are promising and demonstrate the effectiveness of the proposed methods for speech emotion recognition. The two deep learning (DL) based methods were also compared to a support vector machine (SVM) based method and showed superior performance.

Keywords: Speech emotion recognition, Spontaneous corpus, Deep neural networks, Feature extraction, Extremely randomized trees

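To illustrate the classification stage of the second method, the following is a minimal sketch assuming features have already been extracted by a DCNN; it trains scikit-learn's ExtraTreesClassifier (extremely randomized trees) and scores it with UAR, computed as macro-averaged recall. The data shape, number of classes, and hyperparameters are placeholders and do not reproduce the FAU Aibo setup or the paper's results.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

# Placeholder data: 1000 utterances with 256-dim DCNN embeddings, 5 emotion classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))
y = rng.integers(0, 5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Extremely randomized trees classifier on the (hypothetical) DCNN features.
clf = ExtraTreesClassifier(n_estimators=500, random_state=0)
clf.fit(X_train, y_train)

# UAR = mean of per-class recalls, i.e. macro-averaged recall.
uar = recall_score(y_test, clf.predict(X_test), average="macro")
print(f"UAR: {uar:.3f}")
```

Macro-averaged recall is used here because, unlike overall accuracy, it weights every emotion class equally, which is why UAR is the standard metric on the FAU Aibo task.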