The Impact of Attention Mechanisms on Speech Emotion Recognition.

Shouyan Chen, Xinqi Sun, Xiaofen Yang, Mingyan Zhang, Zhijia Zhao, Tao Zou

doi:10.3390/s21227530

Abstract

Speech emotion recognition (SER) plays an important role in real-time applications of human-machine interaction. Attention mechanisms are widely used to improve SER performance, but the rules governing when each mechanism applies have not been discussed in depth. This paper discusses the difference between Global-Attention and Self-Attention and explores their applicable rules for constructing SER classifiers. The experimental results show that, when the model is built from a CNN and an LSTM, Global-Attention improves the accuracy of the sequential model, while Self-Attention improves the accuracy of the parallel model. With this knowledge, a classifier (CNN-LSTM×2+Global-Attention model) for SER is proposed. The experimental results show that it achieves an accuracy of 85.427% on the EMO-DB dataset.
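To make the distinction between the two mechanisms concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give their equations, so the dot-product scoring, the single `query` vector, the identity projections, and the toy `frames` values are all assumptions chosen for illustration. The point it illustrates is structural: global attention pools a sequence of frame features into one context vector, whereas self-attention lets every frame attend to every other frame and returns a sequence of the same length.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def global_attention(hidden_states, query):
    """Pool T frame vectors into ONE context vector.

    Assumption: scores are plain dot products with a single query vector
    (for example a final LSTM state or a learned vector).
    """
    weights = softmax([dot(h, query) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return context, weights


def self_attention(hidden_states):
    """Scaled dot-product self-attention returning T vectors.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections), which keeps the sketch free of learned weights.
    """
    dim = len(hidden_states[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in hidden_states:
        weights = softmax([dot(q, k) / scale for k in hidden_states])
        outputs.append([
            sum(w * v[i] for w, v in zip(weights, hidden_states))
            for i in range(dim)
        ])
    return outputs


# Toy inputs (hypothetical): three frames with two features each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]

context, weights = global_attention(frames, query)  # one vector
mixed = self_attention(frames)                      # three vectors
```

In a real model the query and the projections would be learned, and the inputs would be CNN or LSTM feature maps rather than hand-written lists.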

Highlights

Speech Emotion Recognition (SER) has a wide range of potential applications in areas such as human–robot interactions [1], computer-aided instruction, e-commerce, and medical assistance [2,3]. Traditional, effective analysis methods can be roughly divided into dictionary-based methods and machine-learning-based methods.
To propose a CNN-LSTM×2+Global-Attention model: by comparing the training convergence speed, accuracy, and generalization ability of different models, we proposed a Convolutional Neural Network (CNN)-LSTM×2+Global-Attention model and conducted experiments on the EMO-DB dataset, where it achieved an accuracy of 85.427%.
CNN+LSTM alone, and combined with the comparison of the confusion matrix as shown in Figures 12–14, we found that the prediction accuracy of the CNN+LSTM×2+Global-Attention model is significantly better than that of the CNN network alone and the CNN+
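The highlights compare models by accuracy and by confusion matrix. As a small sketch of how the two relate, the snippet below derives overall accuracy and per-class recall from a confusion matrix. The matrix `toy` is made-up illustrative data, not the paper's results, and the row/column convention (rows are true classes, columns are predicted classes) is an assumption.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Recall per true class: diagonal entry divided by its row sum."""
    recalls = []
    for i, row in enumerate(matrix):
        row_total = sum(row)
        recalls.append(row[i] / row_total if row_total else 0.0)
    return recalls


# Hypothetical 3-class counts (rows: true class, columns: predicted class).
toy = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

overall = accuracy_from_confusion(toy)
recalls = per_class_recall(toy)
```

Per-class recall is what a confusion-matrix comparison adds over a single accuracy figure: two models with similar overall accuracy can differ sharply on individual emotion classes.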

Summary

Introduction

Speech Emotion Recognition (SER) has a wide range of potential applications in areas such as human–robot interactions [1], computer-aided instruction, e-commerce, and medical assistance [2,3]. Traditional, effective analysis methods can be roughly divided into dictionary-based methods and machine-learning-based methods. Dictionary-based methods require an emotion dictionary that has been labeled manually. They rely heavily on the quality of that dictionary, and maintaining it takes considerable labor and material resources. As new words continue to emerge, the dictionary cannot keep up with the application's needs [4]. The machine-learning method has attracted great attention from researchers. Convolutional Neural Networks (CNNs) [3,5,6] and LSTM networks [7] and their combination [7,8,9]


Journal: Sensors	Publication Date: Nov 12, 2021
Citations: 19	License type: CC BY 4.0
