Abstract

The analysis of human emotions plays a significant role in patient monitoring, as tracking patients' feelings provides information that supports better management of their diseases. Audio-based emotion recognition has become a fascinating research interest in such domains over the last decade. Audio-based emotion systems depend mostly on the recognition stage. Existing models share a common issue, the objectivity-supposition problem, which can decrease the recognition rate. Therefore, this study investigates an improved classifier based on the hidden conditional random fields (HCRF) model for classifying emotional speech. In this model, we introduce a novel methodology that incorporates multivariate distributions by employing a mixture of full-covariance Gaussian density functions. Owing to this incorporation, the proposed model addresses most of the limitations of existing classifiers. Well-known features such as Mel-frequency cepstral coefficients (MFCCs) are extracted in our experiments. The proposed model is validated and evaluated on two publicly available datasets: the Berlin Database of Emotional Speech (Emo-DB) and the eNTERFACE'05 Audio-Visual Emotion dataset. For validation and comparison against existing techniques, we use a 10-fold cross-validation scheme. The proposed method achieves a significant improvement in classification accuracy (p < 0.03). Moreover, we show that the proposed technique is computationally less expensive than state-of-the-art methods.
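
To make the pipeline described above concrete, the sketch below illustrates the MFCC extraction and 10-fold cross-validation steps. It is a minimal sketch, not the paper's implementation: the librosa/scikit-learn toolchain, the choice of 13 coefficients, the 25 ms/10 ms framing, mean-pooled utterance vectors, and the logistic-regression stand-in (occupying the slot where the HCRF classifier would go) are all assumptions.

    # Minimal sketch: MFCC features + 10-fold cross-validation scaffolding.
    # Assumptions (not from the paper): librosa/scikit-learn toolchain,
    # 13 coefficients, 16 kHz audio, mean-pooled utterance features, and a
    # logistic-regression stand-in where the HCRF classifier would go.
    import numpy as np
    import librosa
    from sklearn.model_selection import StratifiedKFold
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    def extract_mfcc(path, sr=16000, n_mfcc=13):
        """Load one utterance and return a fixed-length MFCC feature vector."""
        y, sr = librosa.load(path, sr=sr)
        # Frame-level MFCCs, shape (n_mfcc, n_frames): 25 ms windows, 10 ms hop.
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                    n_fft=int(0.025 * sr),
                                    hop_length=int(0.010 * sr))
        # Mean-pool over time for a simple utterance-level representation.
        return mfcc.mean(axis=1)

    def evaluate(wav_paths, labels):
        """10-fold stratified cross-validation over utterance-level features."""
        X = np.stack([extract_mfcc(p) for p in wav_paths])
        y = np.asarray(labels)
        skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        scores = []
        for train_idx, test_idx in skf.split(X, y):
            clf = LogisticRegression(max_iter=1000)  # stand-in for the HCRF model
            clf.fit(X[train_idx], y[train_idx])
            scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
        return float(np.mean(scores)), float(np.std(scores))

Note that an actual HCRF operates on the frame-level MFCC sequence rather than a pooled vector; the pooling here only serves to keep the stand-in classifier self-contained.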
