Data augmentation and enhancement for multimodal speech emotion recognition

Jonathan Christian Setyono, Amalia Zahra

doi:10.11591/eei.v12i5.5031

Open Access

https://doi.org/10.11591/eei.v12i5.5031

Abstract

Interaction with one another, for example through conversation or speech, is a fundamental human need. It is therefore important to analyze speech computationally to determine the emotions it conveys. Speech emotion recognition (SER) is a supervised method that detects emotions by examining various aspects of speech and decides the emotion class of that speech. This research proposes a multimodal SER model that uses the attention mechanism, a deep-learning-based enhancement technique. It also addresses the imbalanced-dataset problem in the SER field by using generative adversarial networks (GANs) as a data augmentation technique. With the proposed GAN configuration, the model achieved an evaluation performance of 0.96 (96%). The results show that the GAN method in a multimodal SER model can improve performance and produce a balanced dataset.
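To make the balancing idea concrete, here is a minimal sketch of the bookkeeping behind augmentation for an imbalanced dataset: it counts how many synthetic samples each emotion class would need to match the largest class. It does not implement a GAN and is not the authors' procedure. The function name `synthetic_samples_needed`, the class names, and the label counts are all hypothetical, invented for illustration.

```python
from collections import Counter


def synthetic_samples_needed(labels):
    """Return, per class, how many extra samples would bring it up to the
    size of the largest class (the majority class needs zero)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Hypothetical labels, invented for illustration (not data from the paper).
labels = ["angry"] * 5 + ["happy"] * 3 + ["sad"] * 2
needed = synthetic_samples_needed(labels)
print(needed)
```

In a GAN-based pipeline, a trained generator would be asked to produce the indicated number of samples for each minority class, after which every class has the same number of training examples.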

Journal: Bulletin of Electrical Engineering and Informatics
Publication Date: Oct 1, 2023
Citations: 2
License type: CC BY-SA 4.0

Similar Papers

Emotion Recognition Combining Acoustic and Linguistic Features Based on Speech Recognition Results
Misaki Sakurai ... Tetsuo Kosaka
12 Oct 2021

Algorithm for speech emotion recognition classification based on Mel-frequency Cepstral coefficients and broad learning system
Zhiyou Yang ... Ying Huang
Evolutionary Intelligence | VOL. 15
14 Jan 2021

Speech Signal Imaging and Emotion Recognition Based on Symmetric-Diagonal Matrix Model
Zijun Yang ... Aoran Xi
01 Jan 2023

Speech emotion recognition based on Fuzzy Least Squares Support Vector Machines
Shiqing Zhang
01 Jan 2008
