Abstract

Speech Emotion Recognition (SER) remains a difficult task in speech processing research despite the large body of existing work. The main challenges in SER are selecting a speech emotion database (corpus), extracting the relevant speech features, and constructing an appropriate classification model. This work investigates prosodic features, spectral features, and combinations of features together with their dynamics, in order to characterize and classify the emotions in a speech signal. The fine, intrinsic variations of the speech samples are combined with the static parameters to improve recognition accuracy. Experiments were carried out on three speech emotion databases: RAVDESS, IIITH-TEMD, and DETL (Database for Emotions in Telugu Language), a native-language database that we developed. We extracted MFCC features and hybrid features (MFCC+ΔMFCC+ΔΔMFCC), and applied the individual features and the feature combinations to two classification models, SVM and MLP. The MLP with the hybrid feature combination achieved approximately 75%, 78%, and 81% accuracy on these databases, respectively.
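
The hybrid feature vector (MFCC+ΔMFCC+ΔΔMFCC) can be illustrated with a minimal sketch. The abstract does not say how the delta coefficients are computed, so the code below assumes the commonly used regression formula over a window of ±2 frames with edge frames repeated. The function names `delta` and `hybrid_features` are hypothetical, and the MFCC frames are assumed to have been computed already by some front end.

```python
def delta(frames, width=2):
    """Regression-based delta of a sequence of feature frames.

    frames: list of frames, each a list of floats (e.g. MFCCs).
    width:  number of frames on each side used in the regression
            (assumed value; not given in the abstract).
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                # Repeat the first/last frame at the sequence edges.
                nxt = frames[min(t + n, n_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def hybrid_features(mfcc, width=2):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(mfcc, width)   # ΔMFCC: first-order dynamics
    d2 = delta(d1, width)     # ΔΔMFCC: second-order dynamics
    return [m + a + b for m, a, b in zip(mfcc, d1, d2)]


if __name__ == "__main__":
    # Toy input: 5 frames of 2 coefficients (not real speech data).
    toy = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
    for frame in hybrid_features(toy):
        print(frame)
```

Each output frame is three times as long as the input frame, so 13 MFCCs per frame would yield a 39-dimensional hybrid vector. These per-frame vectors, or statistics pooled over an utterance, would then be passed to a classifier such as an SVM or MLP.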
