Abstract

This paper reports experimental results on speech emotion recognition using conventional machine learning methods and deep learning techniques. We use a selection of mel-frequency cepstral coefficients (MFCCs) as features for the conventional machine learning classifiers, while the convolutional neural network uses mel spectrograms treated as images. We test both approaches on a state-of-the-art free database that provides samples of 8 emotions recorded by 24 professional actors. We report and comment on the accuracy achieved by each classifier in cross-validation experiments. Results of our proposal are competitive with recent studies.
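The two feature types named above (MFCCs for the conventional classifiers, mel spectrograms as images for the CNN) can be sketched in plain NumPy. This is a minimal illustration of the standard pipeline, not the paper's implementation: the frame length, hop size, filter count, and number of coefficients below are assumptions, not values taken from the study.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=26):
    # Frame the signal, apply a Hann window, take the power spectrum.
    frames = [signal[i:i + n_fft] * np.hanning(n_fft)
              for i in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    # Project onto the mel filterbank and compress with a log.
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)

def mfcc(signal, sr, n_mfcc=13, n_mels=26, **kw):
    # MFCCs: DCT-II of the log-mel energies, keeping the first n_mfcc.
    log_mel = log_mel_spectrogram(signal, sr, n_mels=n_mels, **kw)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2 * n_mels)))
    return log_mel @ dct.T  # shape: (num_frames, n_mfcc)

# Example: 1 s of a 440 Hz tone at 16 kHz.
sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
signal = np.sin(2 * np.pi * 440.0 * t)
mel_image = log_mel_spectrogram(signal, sr)  # 2-D array, usable as an image
features = mfcc(signal, sr)                  # per-frame MFCC vectors
```

The `mel_image` array is the kind of time-frequency "image" a CNN can consume, while `features` (often averaged or otherwise summarized over frames) feeds a conventional classifier.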
