Cross-Language Speech Emotion Recognition Using Bag-of-Word Representations, Domain Adaptation, and Data Augmentation.

Shruti Kshirsagar, Tiago H Falk

doi:10.3390/s22176445

Abstract

To date, several methods have been explored for the challenging task of cross-language speech emotion recognition, including the bag-of-words (BoW) methodology for feature processing, domain adaptation for feature distribution “normalization”, and data augmentation to make machine learning algorithms more robust across testing conditions. Their combined use, however, has yet to be explored. In this paper, we aim to fill this gap and compare the benefits achieved by combining different domain adaptation strategies with the BoW method, as well as with data augmentation. Moreover, while domain adaptation strategies, such as the correlation alignment (CORAL) method, require knowledge of the test data language, we propose a variant that we term N-CORAL, in which test languages (in our case, Chinese) are mapped to a common distribution in an unsupervised manner. Experiments with German, French, and Hungarian language datasets were performed, and the proposed N-CORAL method, combined with BoW and data augmentation, was shown to achieve the best arousal and valence prediction accuracy, highlighting the usefulness of the proposed method for “in the wild” speech emotion recognition. In fact, N-CORAL combined with BoW was shown to provide robustness across languages, whereas data augmentation provided additional robustness against cross-corpus nuance factors.
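The abstract names correlation alignment (CORAL) and the N-CORAL variant without giving formulas. The following is a minimal NumPy sketch of the general CORAL idea: whiten a feature set with its own covariance, then re-color it with the covariance of a reference set. It is an illustration under stated assumptions, not the authors' implementation. The function names (`coral_align`, `_sym_matrix_power`), the `eps` regularizer, and the synthetic data are all hypothetical. Aligning both the training and the test features to a third "common" corpus is one reading of the N-CORAL description in the abstract.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power (eigendecomposition)."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_align(features, reference, eps=1.0):
    """Re-color `features` so that their covariance approximates that of `reference`.

    Both inputs are (n_samples, n_dims) arrays. `eps` is an assumed
    regularizer added to the diagonal of each covariance matrix.
    """
    dim = features.shape[1]
    cov_feat = np.cov(features, rowvar=False) + eps * np.eye(dim)
    cov_ref = np.cov(reference, rowvar=False) + eps * np.eye(dim)
    whiten = _sym_matrix_power(cov_feat, -0.5)   # remove the features' own correlations
    recolor = _sym_matrix_power(cov_ref, 0.5)    # impose the reference correlations
    return features @ whiten @ recolor


# Synthetic stand-ins for utterance-level features (hypothetical data).
rng = np.random.default_rng(0)
mix = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 1.0]])
train = rng.normal(size=(500, 3)) @ mix                    # "training language"
test = rng.normal(size=(400, 3)) * [3.0, 1.0, 1.5]         # "test language"
common = rng.normal(size=(600, 3)) * [0.8, 1.2, 2.0]       # "common" reference corpus

# N-CORAL-style use (assumption): map both sets to the common distribution,
# so the test language never needs to be known when fitting the alignment.
train_aligned = coral_align(train, common, eps=1e-6)
test_aligned = coral_align(test, common, eps=1e-6)
```

With a small `eps`, the covariance of each aligned set closely matches that of the reference corpus. A larger `eps` trades exact matching for numerical stability when the feature dimension is high relative to the number of samples.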

Journal: Sensors	Publication Date: Aug 26, 2022
Citations: 5	License type: CC BY 4.0

Similar Papers

Speaker to Emotion: Domain Adaptation for Speech Emotion Recognition with Residual Adapters
Yuxuan Xi ... Lirong Dai
01 Nov 2019

An End-to-end Multitask Learning Model to Improve Speech Emotion Recognition
Changzeng Fu ... Carlos Toshinori Ishi
24 Jan 2021

Speech Emotion Recognition Method Using Depth Wavefield Extrapolation and Improved Wave Physics Model
Chunjun Zheng ... Ning Jia
01 Mar 2021

An Overview of Bag of Words; Importance, Implementation, Applications, and Challenges
Wisam A Qader ... Bilal I Ahmed
01 Jun 2019
