Abstract

Emotion recognition, which seeks to decode the emotional content hidden within data, has long intrigued the field of computer science. Early approaches to sentiment analysis relied predominantly on single-modal data sources, such as textual sentiment analysis, speech-based emotion detection, or the study of facial expressions. In recent years, as data representations have grown increasingly abundant, researchers have gradually turned their attention to multimodal emotion recognition. Multimodal emotion recognition involves not only text but also audio, images, and video, and is of great significance for enhancing human-computer interaction, improving user experience, and advancing emotion-aware applications. This paper thoroughly discusses the research advancements and primary techniques of multimodal emotion recognition. Specifically, it first introduces representative methods for single-modal emotion recognition based on text and image data, including their basic processes, advantages, and disadvantages. Secondly, it reviews pertinent studies on multimodal emotion recognition and offers a quantitative comparison of how different approaches perform on standard multimodal datasets. Lastly, it addresses the complexities inherent in multimodal emotion recognition research and suggests potential directions for future study.
