Abstract

The field of computer science has long been intrigued by emotion recognition, which seeks to decode the emotional content hidden within data. Initial approaches were predominantly based on single-modality data sources, such as textual sentiment analysis, speech-based emotion detection, or the study of facial expressions. In recent years, as data representations have become increasingly abundant, attention has gradually shifted toward multimodal emotion recognition. Multimodal emotion recognition involves not only text but also audio, images, and video, and is of great significance for enhancing human-computer interaction, improving user experience, and enabling emotion-aware applications. This paper thoroughly discusses the research advancements and primary techniques of multimodal emotion recognition. Specifically, it first introduces representative methods for single-modal emotion recognition based on text and image data, including their basic processes, advantages, and disadvantages. Secondly, it surveys pertinent studies on multimodal emotion recognition and offers a quantitative comparison of how different approaches perform on standard multimodal datasets. Lastly, it addresses the complexities inherent in multimodal emotion recognition research and suggests potential areas for future study.
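To make the multimodal setting concrete, the sketch below illustrates one common baseline technique, late (decision-level) fusion: each modality-specific classifier emits a probability distribution over a shared emotion label set, and the distributions are combined by weighted averaging. This is a generic illustrative example, not a method described in the surveyed paper; the label set, classifier outputs, and weights are all hypothetical.

```python
# Minimal sketch of late (decision-level) fusion for multimodal emotion
# recognition. Each modality-specific classifier is assumed to output a
# probability distribution over the same set of emotion labels.
EMOTIONS = ["happy", "sad", "angry", "neutral"]

def late_fusion(modality_probs, weights=None):
    """Combine per-modality emotion probabilities by weighted averaging."""
    n = len(modality_probs)
    weights = weights or [1.0 / n] * n  # default: equal weight per modality
    fused = [0.0] * len(EMOTIONS)
    for w, probs in zip(weights, modality_probs):
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused

# Hypothetical classifier outputs for text, audio, and visual inputs.
text_p  = [0.6, 0.1, 0.1, 0.2]
audio_p = [0.5, 0.2, 0.1, 0.2]
video_p = [0.4, 0.1, 0.3, 0.2]

fused = late_fusion([text_p, audio_p, video_p])
label = EMOTIONS[fused.index(max(fused))]
print(label)  # prints "happy"
```

Early fusion (concatenating modality features before a single classifier) is the usual alternative; late fusion is shown here only because it is self-contained and easy to read.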
