Improving Speech Emotion Recognition With Adversarial Data Augmentation Network.

Lu Yi, Man-Wai Mak

doi:10.1109/tnnls.2020.3027600

Abstract

When training data are scarce, it is challenging to train a deep neural network without overfitting. To overcome this challenge, this article proposes a new data augmentation network, namely the adversarial data augmentation network (ADAN), based on generative adversarial networks (GANs). The ADAN consists of a GAN, an autoencoder, and an auxiliary classifier. These networks are trained adversarially to synthesize class-dependent feature vectors in both the latent space and the original feature space, which can be added to the real training data for training classifiers. Instead of using the conventional cross-entropy loss for adversarial training, the Wasserstein divergence is used in an attempt to produce high-quality synthetic samples. The proposed networks were applied to speech emotion recognition using EmoDB and IEMOCAP as the evaluation data sets. It was found that by forcing the synthetic latent vectors and the real latent vectors to share a common representation, the gradient vanishing problem can be largely alleviated. Also, results show that the augmented data generated by the proposed networks are rich in emotion information. Thus, the resulting emotion classifiers are competitive with state-of-the-art speech emotion recognition systems.
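
The augmentation step described in the abstract, in which class-dependent synthetic feature vectors are added to the real training data, can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_generator` is a hypothetical stand-in that draws Gaussian vectors centred on the class index, whereas the ADAN generator is a neural network trained adversarially. The function names, the feature dimension, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def toy_generator(label, dim, rng):
    """Hypothetical stand-in for a trained class-conditional generator.

    Draws a Gaussian vector centred on the class index. The real ADAN
    generator is a neural network; this only mimics its interface.
    """
    return [rng.gauss(float(label), 1.0) for _ in range(dim)]


def augment(real_data, n_synthetic_per_class, dim, seed=0):
    """Return the real (features, label) pairs followed by synthetic ones."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    labels = sorted({label for _, label in real_data})
    synthetic = [
        (toy_generator(label, dim, rng), label)
        for label in labels
        for _ in range(n_synthetic_per_class)
    ]
    return list(real_data) + synthetic


# Toy "real" data: two-dimensional feature vectors with emotion class labels.
real = [([0.1, 0.2], 0), ([0.9, 1.1], 1), ([1.0, 0.8], 1)]
augmented = augment(real, n_synthetic_per_class=2, dim=2)
class_counts = Counter(label for _, label in augmented)
print(len(augmented), dict(class_counts))
```

A downstream emotion classifier would then be trained on `augmented` rather than on `real` alone. Whether the synthetic vectors help depends entirely on the quality of the generator, which is the part this sketch does not model.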

Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication Date: Oct 9, 2020
Citations: 54
License type: mit

Similar Papers

Investigation of multilingual and mixed-lingual emotion recognition using enhanced cues with data augmentation
S Lalitha ... Yousef Ajami Alotaibi
Applied Acoustics | VOL. 170
22 Jul 2020

Speech emotion recognition using data augmentation method by cycle-generative adversarial networks
Arash Shilandari ... Hossein Khosravi
Signal, Image and Video Processing | VOL. 16
09 Feb 2022

RMWSaug: Robust Multi-window Spectrogram Augmentation Approach for Deep Learning based Speech Emotion Recognition
Shehu Mohammed Yusuf ... E A Adedokun
06 Oct 2021

Speech emotion recognition systems and their security aspects
Itzik Gurowiec ... Nir Nissim
Artificial Intelligence Review | VOL. 57
21 May 2024
