Learning Salient Features for Speech Emotion Recognition Using Convolutional Neural Networks

Qirong Mao, Ming Dong, Zhengwei Huang, Yongzhao Zhan

doi:10.1109/tmm.2014.2360798

Abstract

As an essential means of understanding human emotional behavior, speech emotion recognition (SER) has attracted a great deal of attention in human-centered signal processing. Accuracy in SER depends heavily on finding good affect-related, discriminative features. In this paper, we propose to learn affect-salient features for SER using convolutional neural networks (CNNs). Training the CNN involves two stages. In the first stage, unlabeled samples are used to learn local invariant features (LIF) using a variant of the sparse auto-encoder (SAE) with reconstruction penalization. In the second stage, LIF is used as the input to a feature extractor, salient discriminative feature analysis (SDFA), to learn affect-salient, discriminative features using a novel objective function that encourages feature saliency, orthogonality, and discrimination for SER. Our experimental results on benchmark datasets show that our approach leads to stable and robust recognition performance in complex scenes (e.g., with speaker and language variation, and environment distortion) and outperforms several well-established SER features.
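To make the first stage more concrete, the following is a minimal sketch of a generic sparse auto-encoder objective: reconstruction error plus a KL-divergence sparsity penalty on the mean hidden activation. It is not the variant used in the paper, whose exact penalty is not given in the abstract. The function name, the sigmoid units, and the values of `rho` and `beta` are assumptions chosen for illustration only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sparse_autoencoder_loss(X, W1, b1, W2, b2, rho=0.05, beta=3.0):
    """Generic sparse auto-encoder loss (illustrative, not the paper's variant).

    X        : (n_samples, n_inputs) unlabeled inputs scaled to [0, 1]
    W1, b1   : encoder weights (n_inputs, n_hidden) and bias (n_hidden,)
    W2, b2   : decoder weights (n_hidden, n_inputs) and bias (n_inputs,)
    rho      : target mean activation of each hidden unit (assumed value)
    beta     : weight of the sparsity penalty (assumed value)
    Returns the scalar loss and the hidden activations.
    """
    H = sigmoid(X @ W1 + b1)      # hidden activations, i.e. the learned features
    X_hat = sigmoid(H @ W2 + b2)  # reconstruction of the input

    # Mean squared reconstruction error per sample.
    recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))

    # KL divergence between the target sparsity rho and the observed
    # mean activation of each hidden unit; clipping avoids log(0).
    rho_hat = np.clip(H.mean(axis=0), 1e-8, 1.0 - 1e-8)
    kl = np.sum(
        rho * np.log(rho / rho_hat)
        + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))
    )
    return recon + beta * kl, H
```

In a two-stage pipeline of the kind the abstract describes, activations such as `H` would be computed on unlabeled data first and then passed to a supervised feature extractor; that second stage (SDFA) is not sketched here because its objective is not specified in the abstract.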

Journal: IEEE Transactions on Multimedia
Publication Date: Dec 1, 2014
Citations: 549

Similar Papers

Speech Emotion Recognition Using CNN
Zhengwei Huang ... Yongzhao Zhan
03 Nov 2014

An Ensemble Model for Multi-Level Speech Emotion Recognition
Chunjun Zheng ... Ning Jia
Applied Sciences | VOL. 10
26 Dec 2019

Robust emotion recognition in noisy speech via sparse representation
Xiaoming Zhao ... Shiqing Zhang
Neural Computing and Applications | VOL. 24
29 Mar 2013

Clustering-Based Speech Emotion Recognition by Incorporating Learned Features and Deep BiLSTM
Mustaqeem ... Muhammad Sajjad
IEEE Access | VOL. 8
01 Jan 2020
