Two-level discriminative speech emotion recognition model with wave field dynamics: A personalized speech emotion recognition method

Ning Jia, Chunjun Zheng

doi:10.1016/j.comcom.2021.09.013

Abstract

Presently available speech emotion recognition (SER) methods generally rely on a single SER model. Achieving higher SER accuracy depends on both the feature extraction method and the model design. However, the generalization performance of such models is typically poor because the emotional features of different speakers can vary substantially. The present work addresses this issue by applying a two-level discriminative model to the SER task. The first level places an individual speaker within a specific speaker group according to the speaker’s characteristics. The second level constructs a personalized SER model for each group of speakers using the wave field dynamics model and a dual-channel general SER model. The two levels of the discriminative model are fused in an ensemble learning scheme to achieve effective SER classification. The proposed method is shown to provide higher SER accuracy in experiments on the interactive emotional dyadic motion capture (IEMOCAP) corpus and a custom-built SER corpus. On the IEMOCAP corpus, the proposed model improves recognition accuracy by 7%. On the custom-built SER corpus, both masked and unmasked speakers are employed to demonstrate that the proposed method maintains higher SER accuracy.
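
The two-level idea in the abstract (first assign a speaker to a group, then apply that group's own emotion model) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid rule stands in for both the speaker-grouping step and the per-group SER models, the wave field dynamics model and the ensemble fusion are not represented, and every name and number below (`TwoLevelSER`, `group_a`, the centroid values) is a hypothetical chosen for illustration.

```python
from math import dist


def nearest(centroids, x):
    """Return the key whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda k: dist(centroids[k], x))


class TwoLevelSER:
    """Toy two-level classifier: speaker group first, then a per-group emotion model."""

    def __init__(self, group_centroids, emotion_centroids_by_group):
        # Level 1: one centroid per speaker group (hypothetical speaker features).
        self.group_centroids = group_centroids
        # Level 2: one set of emotion centroids per speaker group.
        self.emotion_centroids_by_group = emotion_centroids_by_group

    def predict(self, speaker_features, emotion_features):
        # Level 1: place the speaker in the closest group.
        group = nearest(self.group_centroids, speaker_features)
        # Level 2: classify the emotion with that group's own model.
        emotion = nearest(self.emotion_centroids_by_group[group], emotion_features)
        return group, emotion


# Made-up centroids, for illustration only.
model = TwoLevelSER(
    group_centroids={"group_a": (0.0, 0.0), "group_b": (10.0, 10.0)},
    emotion_centroids_by_group={
        "group_a": {"neutral": (0.0,), "angry": (1.0,)},
        "group_b": {"neutral": (5.0,), "angry": (9.0,)},
    },
)

# The same emotion feature value is read differently depending on the group.
result_b = model.predict((9.0, 11.0), (6.0,))
result_a = model.predict((1.0, 0.0), (6.0,))
```

In this toy setup the same emotion feature value is labeled differently in the two groups, which is the motivation the abstract gives for personalizing the second-level model per speaker group.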

Journal: Computer Communications
Publication Date: Sep 22, 2021
Citations: 4

Similar Papers

Algorithm for speech emotion recognition classification based on Mel-frequency Cepstral coefficients and broad learning system
Zhiyou Yang ... Ying Huang
Evolutionary Intelligence | VOL. 15
14 Jan 2021

Speech emotion recognition based on Fuzzy Least Squares Support Vector Machines
Shiqing Zhang
01 Jan 2008

An Attention Pooling Based Representation Learning Method for Speech Emotion Recognition
Pengcheng Li ... Wu Guo
02 Sep 2018

Emotion Recognition Combining Acoustic and Linguistic Features Based on Speech Recognition Results
Misaki Sakurai ... Tetsuo Kosaka
12 Oct 2021
