Abstract
With the rapid development of deep learning and wireless communication technology, emotion recognition has attracted growing attention from researchers. Computers can only be truly intelligent when they can perceive human emotions, and emotion recognition is the primary step toward that goal. This paper proposes a multimodal emotion recognition model based on a multiobjective optimization algorithm. The model combines voice information and facial information and can simultaneously optimize both the accuracy and the uniformity of recognition. The speech modality is based on an improved deep convolutional neural network (DCNN); the video image modality is based on an improved depthwise separable convolutional network (DSCNN). After single-modality recognition, a multiobjective optimization algorithm is used to fuse the two modalities at the decision level. The experimental results show that the proposed model improves markedly on every evaluation metric, and its emotion recognition accuracy is 2.88% higher than that of the ISMS_ALA model. The results show that the multiobjective optimization algorithm can effectively improve the performance of the multimodal emotion recognition model.
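The decision-level fusion described above can be illustrated with a minimal sketch. The abstract does not specify the fusion rule or the exact form of the two objectives, so everything here is an assumption for illustration: two single-modality classifiers emit class-probability vectors, a weighted average fuses them, and the two objectives (overall accuracy and a "uniformity" score balancing per-class accuracies) are the quantities a multiobjective optimizer would trade off when choosing the fusion weight.

```python
import numpy as np

# Hypothetical per-class probabilities from the two single-modality models
# (the class set, names, and values are illustrative, not from the paper).
p_speech = np.array([0.60, 0.25, 0.15])  # e.g. improved DCNN on audio
p_face   = np.array([0.30, 0.55, 0.15])  # e.g. improved DSCNN on video frames

def fuse(p_a, p_b, w):
    """Weighted decision-level fusion of two class-probability vectors.

    w is the fusion weight a multiobjective optimizer would search over.
    """
    p = w * p_a + (1.0 - w) * p_b
    return p / p.sum()

def uniformity(per_class_acc):
    """One plausible 'uniformity' objective: negative standard deviation
    of per-class accuracies (higher means more balanced recognition)."""
    return -float(np.std(per_class_acc))

fused = fuse(p_speech, p_face, w=0.5)
pred = int(np.argmax(fused))  # index of the predicted emotion class
```

In a real multiobjective setup, accuracy and uniformity would both be evaluated on a validation set for each candidate weight, yielding a Pareto front of trade-offs rather than a single best weight.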
Highlights
The concept of “affective computing” was first proposed by Professor Picard of the Massachusetts Institute of Technology in the book Affective Computing, published in 1997
The multiobjective optimization algorithm is used to optimize the accuracy of model recognition and the uniformity of emotion recognition at the same time
This paper presents a multimodal emotion recognition model based on the multiobjective optimization algorithm
Summary
The concept of “affective computing” was first proposed by Professor Picard of the Massachusetts Institute of Technology in the book Affective Computing, published in 1997. She defined “affective computing” as computing that relates to, arises from, or can influence human emotion [1]. The external expression of human emotion mainly includes voice, facial expression, posture, and so on. Human speech carries linguistic information as well as nonlinguistic information such as the speaker’s emotional state; it can express emotion because it contains acoustic parameters that reflect emotional characteristics. Facial expression is an important external form of emotion and likewise carries emotional information. Research on facial expression recognition can effectively advance both emotion recognition research and the automatic understanding of images by computers [4,5,6]