Abstract

A novel technique is presented for Speech Emotion Recognition (SER) using the Ramanujan Fourier Transform (RFT). The method numerically encodes the speech emotion data and then applies the RFT, which projects the resulting numerical series onto a set of basis functions built from Ramanujan sums (RS). Statistical features are extracted from the RFT of each speech emotion sample and fed to machine learning classifiers. The approach is evaluated on seven SER databases: Berlin, eNTERFACE'05, RAVDESS, SAVEE, EMOVO, EmoFilm, and Urdu. The results reveal that the multiclass SVM outperforms the KNN and Linear Discriminant Analysis classifiers in terms of accuracy. As a stand-alone feature with the multiclass SVM classifier, the RFT recognizes speech emotion with an accuracy of 83.08% for Berlin, 82.67% for eNTERFACE'05, 81.79% for EmoFilm, 82.98% for RAVDESS, 82.99% for EMOVO, 84% for Urdu, and 83.75% for SAVEE. The outcome of this work paves the way for researchers in speech emotion analysis toward real-world applications.
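
The projection onto Ramanujan sums described above can be illustrated with a minimal sketch. It assumes the speech sample has already been encoded as a finite numerical series x(1), ..., x(N), and it uses one common finite-length form of the RFT coefficient, X_q = (1 / (phi(q) N)) * sum_n x(n) c_q(n), where c_q(n) is the Ramanujan sum and phi is Euler's totient. The abstract does not specify the encoding, the normalization, or which statistics are extracted, so the function names (`ramanujan_sum`, `rft_coefficients`, `statistical_features`) and the chosen statistics are illustrative assumptions, not the authors' implementation.

```python
import math
from statistics import mean, pstdev


def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(
        math.cos(2 * math.pi * k * n / q)
        for k in range(1, q + 1)
        if math.gcd(k, q) == 1
    )
    # Ramanujan sums are integers; rounding removes floating-point noise.
    return round(total)


def euler_phi(q):
    """Euler's totient: count of 1 <= k <= q that are coprime to q."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)


def rft_coefficients(x, max_q):
    """Project the series x(1..N) onto c_q for q = 1..max_q.

    Assumed normalization: X_q = sum_n x(n) c_q(n) / (phi(q) * N).
    """
    N = len(x)
    coeffs = []
    for q in range(1, max_q + 1):
        projection = sum(x[n - 1] * ramanujan_sum(q, n) for n in range(1, N + 1))
        coeffs.append(projection / (euler_phi(q) * N))
    return coeffs


def statistical_features(coeffs):
    """Illustrative statistics of the coefficient magnitudes (an assumption)."""
    mags = [abs(c) for c in coeffs]
    return {
        "mean": mean(mags),
        "std": pstdev(mags),
        "max": max(mags),
        "energy": sum(m * m for m in mags),
    }


# Toy input: a series that is itself the Ramanujan sum c_3(n), n = 1..12.
x = [ramanujan_sum(3, n) for n in range(1, 13)]
coeffs = rft_coefficients(x, max_q=4)
features = statistical_features(coeffs)
```

For this toy input, whose length of 12 is a common multiple of the periods 1 to 4, only the q = 3 coefficient is non-zero, which shows how the transform isolates a periodic component. In a pipeline like the one described, a feature vector of this kind would be computed per speech sample and passed to a classifier such as a multiclass SVM.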
