Multi-Source Discriminant Subspace Alignment for Cross-Domain Speech Emotion Recognition

Shaokai Li, Peng Song, Wenming Zheng

doi:10.1109/taslp.2023.3288415

Abstract

Cross-domain speech emotion recognition (SER), an important research direction in speech signal processing, is an effective strategy for improving the generalization ability of emotion classification models. However, because speech signals are non-stationary, it is difficult to train a robust classifier from a single-source emotional corpus. To address this shortcoming, we propose a novel method named multi-source discriminant subspace alignment (MDSA) for cross-domain SER. In MDSA, we first conduct linear discriminant analysis (LDA) in the multi-source domain. Then, the instances in the multi-source discriminant subspace are used to linearly reconstruct the instances in the target subspace. At the same time, the reconstruction contribution of each source discriminant subspace is determined by adaptive weights. Furthermore, the multi-source discriminant subspace is aligned by reducing the loss between projections, which makes the model more robust. In this way, MDSA considers both the alignment of the cross-domain data distributions and the structural information of cross-domain instances. Finally, extensive experiments are conducted on five standard emotional corpora, i.e., Berlin, IEMOCAP, CVE, EMOVO, and TESS, and the results demonstrate that the proposed MDSA outperforms several state-of-the-art transfer learning algorithms. The code is available at https://github.com/shaokai1209/MDSA.
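
The pipeline described in the abstract (per-source LDA, linear reconstruction of target instances from source instances, adaptive source weights) can be illustrated with a small NumPy sketch. This is not the authors' implementation, which is available at the repository above. It is a simplified stand-in under stated assumptions: the reconstruction matrix is built by averaging the `m` nearest source instances rather than by solving the paper's optimization problem, the adaptive weights are a softmax over reconstruction errors with a hypothetical temperature `tau`, the alignment loss between projections is omitted, and all sources are assumed to share the label set 0..C-1. All function names (`lda_projection`, `reconstruct_target`, `adaptive_weights`, `mdsa_sketch`, `make_domain`) and the toy data are hypothetical.

```python
import numpy as np

def lda_projection(X, y, dim):
    """Fisher LDA: return a (d x dim) projection for samples X (n x d), labels y."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)
    # Small ridge keeps Sw invertible (an assumption of this sketch).
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    top = np.argsort(-evals.real)[:dim]
    return evecs[:, top].real

def reconstruct_target(S, T, m=5):
    """Build Z (n_t x n_s) so that each target row is approximated by the mean
    of its m nearest source rows; return Z and the mean squared error."""
    d2 = ((T[:, None, :] - S[None, :, :]) ** 2).sum(axis=2)  # pairwise sq. distances
    nearest = np.argsort(d2, axis=1)[:, :m]
    Z = np.zeros((T.shape[0], S.shape[0]))
    np.put_along_axis(Z, nearest, 1.0 / m, axis=1)
    err = np.mean(np.sum((T - Z @ S) ** 2, axis=1))
    return Z, err

def adaptive_weights(errors, tau=1.0):
    """Softmax over negative errors: sources that reconstruct the target
    better receive larger weights (stand-in for the paper's adaptive weights)."""
    e = np.asarray(errors, dtype=float)
    w = np.exp(-(e - e.min()) / tau)
    return w / w.sum()

def mdsa_sketch(sources, Xt, dim=2, m=5, tau=1.0):
    """Return per-source weights and predicted target labels."""
    errors, votes = [], []
    for Xs, ys in sources:
        P = lda_projection(Xs, ys, dim)             # discriminant subspace of this source
        Z, err = reconstruct_target(Xs @ P, Xt @ P, m)
        errors.append(err)
        votes.append(Z @ np.eye(ys.max() + 1)[ys])  # soft label votes per target row
    weights = adaptive_weights(errors, tau)
    scores = sum(w * v for w, v in zip(weights, votes))
    return weights, scores.argmax(axis=1)

def make_domain(rng, n_per_class=40, d=6, shift=0.0):
    """Hypothetical toy corpus: 3 classes, optional domain shift on feature 0."""
    centers = 3.0 * np.eye(3, d)
    X = np.vstack([c + rng.normal(size=(n_per_class, d)) for c in centers])
    X[:, 0] += shift
    y = np.repeat(np.arange(3), n_per_class)
    return X, y

rng = np.random.default_rng(0)
src_a = make_domain(rng)              # drawn from the same distribution as the target
src_b = make_domain(rng, shift=10.0)  # strongly shifted source
Xt, yt = make_domain(rng)
weights, pred = mdsa_sketch([src_a, src_b], Xt)
print("source weights:", weights.round(3))
print("target accuracy:", (pred == yt).mean())
```

In this toy setting the shifted source reconstructs the target poorly, so its weight shrinks and the prediction is driven by the better-matched source. The nearest-neighbour averaging is only one way to obtain a constrained reconstruction matrix; an unconstrained least-squares fit would reproduce almost any low-dimensional target from enough source instances and would therefore carry little information about domain similarity.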


Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2023
Citations: 3

