Abstract

As an important branch of affective computing, Speech Emotion Recognition (SER) plays a vital role in human–computer interaction. To mine the relevance of signals in audio and increase the diversity of information, Bi-directional Long Short-Term Memory with Directional Self-Attention (BLSTM-DSA) is proposed in this paper. Long Short-Term Memory (LSTM) can learn long-term dependencies from learned local features. Moreover, Bi-directional Long Short-Term Memory (BLSTM) makes the structure more robust through its direction mechanism, because directional analysis can better recognize the hidden emotions in a sentence. At the same time, the autocorrelation of speech frames can compensate for missing information, which motivates introducing the Self-Attention mechanism into SER. The attention weight of each frame is calculated from the outputs of the forward and backward LSTMs separately, rather than after adding them together. Thus, the algorithm can automatically weight speech frames so that the temporal network selects the frames carrying emotional information. When evaluated on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database and the Berlin Database of Emotional Speech (EMO-DB), BLSTM-DSA demonstrates satisfactory performance on the task of speech emotion recognition. In particular, for recognizing happiness and anger, BLSTM-DSA achieves the highest recognition accuracies.
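To make the directional attention concrete, the following is a minimal PyTorch-style sketch of the idea, assuming scaled dot-product self-attention applied to each LSTM direction separately; the class name, feature dimensions, mean-pooling, and fusion by concatenation are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BLSTMDSA(nn.Module):
    """Illustrative sketch: a BLSTM whose forward and backward outputs
    each receive their own self-attention before the two are fused."""

    def __init__(self, input_dim=40, hidden_dim=128, num_classes=4):
        super().__init__()
        self.blstm = nn.LSTM(input_dim, hidden_dim,
                             batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    @staticmethod
    def _self_attend(h):
        # Frame-by-frame similarity (autocorrelation of frames),
        # softmax-normalized into per-frame attention weights.
        scores = h @ h.transpose(1, 2) / h.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ h   # (batch, frames, hidden)

    def forward(self, x):                # x: (batch, frames, input_dim)
        out, _ = self.blstm(x)           # (batch, frames, 2 * hidden_dim)
        h_fwd, h_bwd = out.chunk(2, dim=-1)

        # Attention is computed per direction, not on the summed output.
        ctx_fwd = self._self_attend(h_fwd).mean(dim=1)
        ctx_bwd = self._self_attend(h_bwd).mean(dim=1)

        return self.classifier(torch.cat([ctx_fwd, ctx_bwd], dim=-1))

# Usage: a batch of 8 utterances, 200 frames of 40-dim features each.
model = BLSTMDSA()
logits = model(torch.randn(8, 200, 40))  # -> emotion logits of shape (8, 4)
```

In this sketch, each direction's frames attend only to frames from the same direction, mirroring the abstract's point that attention weights are computed from the forward and backward outputs respectively rather than from their sum.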
