Abstract
Multi-modal emotion recognition improves the naturalness and accuracy of human-computer interaction by integrating multi-source data such as facial expressions, voice intonation, and text. Focusing on this emerging field, this paper first introduces single-modal emotion recognition methods for text, face, and voice, then turns to the problem of multi-modal emotion fusion and surveys fusion methods that have achieved high recognition accuracy in recent years. A comparative analysis shows that current fusion methods have become more complex and that fusion accuracy has improved to some extent. However, datasets for multi-modal emotion analysis remain scarce, and research on emotion recognition from gestures and other modalities is still limited. Future work should enrich these datasets and incorporate new modalities to improve the accuracy and robustness of multi-modal emotion recognition and analysis systems.