Abstract

Speech Emotion Recognition (SER) has attracted considerable interest in recent years. Combining different speech features improves the accuracy of an SER system, but it also increases the time the classifier needs to train on the resulting large feature set. In addition, some features may not be useful for emotion recognition, and including them can reduce recognition accuracy. To overcome these drawbacks, feature selection algorithms can be used to choose the most prominent features, that is, those that contribute most to classifying the emotions efficiently. In this paper, Feature Selection with Adaptive Structure Learning (FSASL) is used to select appropriate features for SER. In the proposed SER system, the 1582 INTERSPEECH 2010 paralinguistic features are extracted from the speech signal, and the FSASL feature selection algorithm selects the best features from this large feature set. SVM and k-NN classifiers with a 5-fold cross-validation scheme are used to classify the emotions. The Berlin German database EMO-DB is used in this work, and classification accuracy is the performance metric used to evaluate the proposed SER system. The results show that using the FSASL algorithm markedly improves the classification accuracy of the proposed SER system compared with the baseline and with existing SER systems.
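
The overall shape of such a pipeline (select features, then classify with 5-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation: FSASL is replaced here by a simple Fisher-score filter as a stand-in, only k-NN is shown (no SVM), and the data are synthetic rather than INTERSPEECH 2010 features from EMO-DB. All function names (`make_toy_data`, `fisher_scores`, `select_top_k`, `knn_predict`, `cross_validate`) and parameter values are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def make_toy_data(n_per_class=30, n_informative=4, n_noise=16, seed=0):
    """Synthetic stand-in for an utterance-level feature matrix (3 classes)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            # Informative features shift with the class; noise features do not.
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(n_informative)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


def fisher_scores(X, y):
    """Per-feature ratio of between-class to within-class variance.

    Assumption: a plain filter criterion used in place of FSASL.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(column, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / (within + 1e-12))
    return scores


def select_top_k(X, y, k):
    """Return the sorted indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_validate(X, y, n_folds=5, n_features=4, k=3, seed=0):
    """Shuffled (not stratified) k-fold CV; returns one accuracy per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    accuracies = []
    for f in range(n_folds):
        train_idx = [i for g in range(n_folds) if g != f for i in folds[g]]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        # Select features on the training folds only, so the held-out fold
        # does not influence which features are kept.
        keep = select_top_k(train_X, train_y, n_features)
        reduced_train = [[row[j] for j in keep] for row in train_X]
        correct = sum(
            knn_predict(reduced_train, train_y, [X[i][j] for j in keep], k) == y[i]
            for i in folds[f]
        )
        accuracies.append(correct / len(folds[f]))
    return accuracies


X, y = make_toy_data()
accuracies = cross_validate(X, y)
print("per-fold accuracy:", accuracies)
print("mean accuracy:", mean(accuracies))
```

One design choice in this sketch is that feature selection is repeated inside each fold on the training portion only. Selecting features once on the full data set before cross-validation would let the held-out samples influence the selection and could inflate the reported accuracy.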

