Abstract

Emotion recognition from speech has notable applications in speech-processing systems. This paper investigates the effect of a rich feature set, comprising formant-frequency-related features, pitch-frequency-related features, energy, and the first two mel-frequency cepstral coefficients (MFCCs), on the performance of speech emotion recognition systems. To this end, different feature sets are employed, and the fast correlation-based filter (FCBF) feature selection method is used to determine efficient feature subsets. A fuzzy ARTMAP neural network (FAMNN) architecture is then used to recognize emotion from speech, and a genetic algorithm (GA) is employed to determine optimum values of the choice parameter (α), the vigilance parameters (ρa, ρb, and ρab), and the learning rate (β) of the FAMNN. Experimental results show an improvement in the recognition rate for the anger, happiness, and neutral states when a subset of 25 selected features is used with the GA-optimized FAMNN-based emotion recognizer.
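To illustrate the roles of the choice parameter (α), a vigilance parameter (ρ), and the learning rate (β), the following is a minimal sketch of a single fuzzy ART module using the standard fuzzy ART equations. It is not the authors' implementation: the function names and default parameter values are hypothetical, and it covers only one unsupervised module, not the full fuzzy ARTMAP with its second module, map field, and ρab.

```python
# Minimal fuzzy ART sketch (one module only); names and defaults are illustrative.

def fuzzy_and(a, b):
    """Component-wise minimum (fuzzy AND) of two vectors."""
    return [min(x, y) for x, y in zip(a, b)]

def norm1(v):
    """City-block norm; inputs are assumed to lie in [0, 1]."""
    return sum(v)

def complement_code(x):
    """Complement coding: x -> (x, 1 - x)."""
    return list(x) + [1.0 - v for v in x]

def choice(i, w, alpha):
    """Choice function T_j = |I ^ w_j| / (alpha + |w_j|)."""
    return norm1(fuzzy_and(i, w)) / (alpha + norm1(w))

def match(i, w):
    """Match function |I ^ w_j| / |I|, compared against the vigilance rho."""
    return norm1(fuzzy_and(i, w)) / norm1(i)

def update(i, w, beta):
    """Learning rule w_new = beta * (I ^ w) + (1 - beta) * w."""
    m = fuzzy_and(i, w)
    return [beta * mi + (1.0 - beta) * wi for mi, wi in zip(m, w)]

def present(x, weights, alpha=0.001, rho=0.75, beta=1.0):
    """Present one input; return the index of the category that learns it."""
    i = complement_code(x)
    # Try categories in order of decreasing choice value.
    order = sorted(range(len(weights)),
                   key=lambda j: choice(i, weights[j], alpha),
                   reverse=True)
    for j in order:
        if match(i, weights[j]) >= rho:      # vigilance test passed
            weights[j] = update(i, weights[j], beta)
            return j
    weights.append(i)                        # no resonance: commit a new category
    return len(weights) - 1
```

In this sketch, a higher ρ forces finer categories, β = 1 corresponds to fast learning, and a small α favors categories whose weights are closest to the input. A genetic algorithm, as in the paper, would search over such parameter values rather than fixing them by hand.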
