Abstract

This paper presents the construction of Binary Support Vector Machines (SVMs) and their significance for efficient Speech Emotion Recognition (SER). The German emotional speech corpus EmoDB is used in this study. Seven Binary SVMs are constructed, one for each of the seven emotions in EmoDB: Anger-Not Anger, Boredom-Not Boredom, Disgust-Not Disgust, Fear-Not Fear, Happy-Not Happy, Sad-Not Sad and Neutral-Not Neutral. Features are selected for these seven Binary SVMs using Correlation-Based Feature Selection (CFS) with Sequential Forward Selection (SFS). One Multiclass SVM is also constructed. Under ten-fold cross-validation, the Binary SVMs achieve an average accuracy of 95.32% and the Multiclass SVM achieves 62.85%. The seven Binary SVMs and the Multiclass SVM are then fused using a combinator algorithm. All the SVMs are run in parallel, each receiving its own SVM-specific features as input. Within the fused model, the Binary SVMs achieve an average accuracy of 92.25% and the Multiclass SVM 77.07% on the test set. On the same test set, the fused model with the combinator algorithm achieves an overall accuracy of 87.86%, which is a significant improvement over the accuracies reported in previous studies.
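
The abstract does not state the rule the combinator algorithm uses, so the following is only a minimal sketch of one plausible way to fuse seven binary decisions with a multiclass prediction. The function name `combine`, the lowercase emotion labels, the signed-score convention (positive means the binary SVM voted for its emotion) and the tie-breaking order are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict

# Lowercase labels for the seven EmoDB emotions (naming is an assumption).
EMOTIONS = ["anger", "boredom", "disgust", "fear", "happy", "sad", "neutral"]


def combine(binary_scores: Dict[str, float], multiclass_label: str) -> str:
    """Hypothetical fusion rule; the source does not specify its combinator.

    binary_scores maps each emotion to the signed score of its binary SVM,
    where a positive score is read as "emotion" and a non-positive score as
    "not emotion". multiclass_label is the multiclass SVM's prediction.
    """
    # Emotions whose binary SVM voted for the emotion itself.
    positives = [e for e in EMOTIONS if binary_scores[e] > 0.0]

    # Exactly one binary SVM fired: accept it.
    if len(positives) == 1:
        return positives[0]

    # Several fired: let the multiclass SVM break the tie if it agrees
    # with one of them.
    if multiclass_label in positives:
        return multiclass_label

    # Several fired and the multiclass SVM disagrees with all of them:
    # take the most confident binary SVM.
    if positives:
        return max(positives, key=lambda e: binary_scores[e])

    # No binary SVM fired: fall back to the multiclass prediction.
    return multiclass_label
```

For example, with `anger` and `fear` both scoring positive and a multiclass prediction of `fear`, this sketch returns `fear`. In a real system the scores could come from each classifier's decision function, evaluated on that classifier's own selected feature subset.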
