A deep interpretable representation learning method for speech emotion recognition

Erkang Jing, Yezheng Liu, Yidong Chai, Jianshan Sun, Sagar Samtani, Yuanchun Jiang, Yang Qian

doi:10.1016/j.ipm.2023.103501

Abstract

This paper focuses on active interpretability for deep learning-based speech emotion recognition (SER). To achieve this, we propose an explicit feature-constrained model, the interpretable group convolutional neural network (IG-CNN). In the proposed model, we first introduce an interpretability constraint to learn human-understandable interpretable representations, so that the emotion prediction decision can be actively interpreted via the model coefficients. To acquire further representations beyond the interpretable ones, and to ensure they are useful for SER, we then design an uncorrelation constraint between the interpretable and autonomous representations and introduce a group CNN structure. We test the model on the IEMOCAP, RAVDESS, eNTERFACE’05, and CREMA-D datasets. Experimental results show that our model outperforms all the baselines. In addition, the proposed model can learn the patterns of human perception of speech emotion and provide explanations for the recognition results.
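
The abstract names the uncorrelation constraint without giving its form. As a minimal sketch, and not the authors' IG-CNN formulation, one plausible penalty is the mean squared Pearson correlation between every interpretable dimension and every autonomous dimension over a batch. The function names and the choice of Pearson correlation below are assumptions made for illustration only.

```python
import math


def column(matrix, j):
    """Return column j of a batch stored as a list of rows."""
    return [row[j] for row in matrix]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def uncorrelation_penalty(interpretable, autonomous):
    """Hypothetical penalty: mean squared correlation across the two groups.

    Both arguments are batches (lists of rows) with the same number of rows.
    The value is 0 when every interpretable dimension is uncorrelated with
    every autonomous dimension, and 1 when all pairs are perfectly correlated.
    """
    d_int = len(interpretable[0])
    d_auto = len(autonomous[0])
    total = 0.0
    for i in range(d_int):
        for j in range(d_auto):
            r = pearson(column(interpretable, i), column(autonomous, j))
            total += r * r
    return total / (d_int * d_auto)


# Toy batch of four samples with one interpretable dimension.
interp = [[1.0], [2.0], [3.0], [4.0]]
redundant = [[2.0], [4.0], [6.0], [8.0]]      # linear copy of interp
independent = [[1.0], [-1.0], [-1.0], [1.0]]  # zero correlation with interp

penalty_redundant = uncorrelation_penalty(interp, redundant)
penalty_independent = uncorrelation_penalty(interp, independent)
```

In a training loop, a term of this kind would typically be added to the classification loss with a weighting coefficient, so that the autonomous representations are pushed to carry information the interpretable ones do not.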

Journal: Information Processing and Management
Publication Date: Sep 6, 2023
Citations: 8

Similar Papers

Emotion Recognition Combining Acoustic and Linguistic Features Based on Speech Recognition Results
Misaki Sakurai ... Tetsuo Kosaka
12 Oct 2021

Speech emotion recognition based on Fuzzy Least Squares Support Vector Machines
Shiqing Zhang
01 Jan 2008

Comparison between Fuzzy and NN Method for Speech Emotion Recognition
A.A Razak ... R Komiya
04 Jul 2005

Algorithm for speech emotion recognition classification based on Mel-frequency Cepstral coefficients and broad learning system
Zhiyou Yang ... Ying Huang
Evolutionary Intelligence | VOL. 15
14 Jan 2021
