Research on Speech Emotion Recognition Based on Teager Energy Operator Coefficients and Inverted MFCC Feature Fusion

Feifan Wang, Xizhong Shen

doi:10.3390/electronics12173599

Journal: Electronics
Publication Date: Aug 25, 2023
Citations: 2
License type: CC BY 4.0

Affiliation: Shanghai Institute of Technology

Abstract

As an important part of daily life, speech has a great impact on the way people communicate. The Mel filter bank used in MFCC extraction handles the low-frequency components of a speech signal well, but it weakens the emotional information carried in the high-frequency part of the signal. We used an inverted Mel filter bank to strengthen feature processing in the high-frequency part of the speech signal, obtaining the IMFCC coefficients, and fused these with the MFCC features to obtain I_MFCC. Finally, to characterize emotional traits more accurately, we combined the Teager energy operator coefficients (TEOC) with I_MFCC to obtain TEOC&I_MFCC, which we input into a CNN_LSTM neural network. Experimental results on RAVDESS show that feature fusion using the Teager energy operator coefficients and I_MFCC yields higher emotion recognition accuracy: the system achieves 92.99% weighted accuracy (WA) and 92.88% unweighted accuracy (UA).
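
The two signal-processing ingredients named in the abstract can be illustrated with a minimal sketch. It assumes the textbook discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and one common way to build an inverted Mel filter bank: mirroring a triangular Mel bank along the frequency axis so that the narrow, dense filters sit at high frequencies. The abstract does not give the authors' exact construction, so the function names, the 2595/700 Mel formula, and the parameters `n_filters`, `n_fft`, and `sample_rate` are all illustrative assumptions. How TEOC is derived from the operator output and how the features are fused are also not shown here.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values (the two boundary samples are skipped).
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def hz_to_mel(f):
    # Widely used Mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters over the n_fft // 2 + 1 non-negative FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft  # centre frequency of FFT bin k
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def inverted_mel_filter_bank(n_filters, n_fft, sample_rate):
    """Mirror the Mel bank along frequency (hypothetical construction).

    Filter order is reversed as well, so filters stay sorted by centre frequency.
    """
    bank = mel_filter_bank(n_filters, n_fft, sample_rate)
    return [row[::-1] for row in reversed(bank)]


# Usage: for a pure tone cos(w * n), the Teager energy is constant at sin(w)^2.
tone = [math.cos(0.3 * n) for n in range(50)]
psi = teager_energy(tone)

mel = mel_filter_bank(8, 512, 16000)
inv = inverted_mel_filter_bank(8, 512, 16000)
```

In this sketch the Mel bank resolves low frequencies finely and high frequencies coarsely, while the mirrored bank does the opposite, which matches the abstract's motivation for pairing MFCC with IMFCC.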

Full Text