Abstract

This article reviews multimodal emotion recognition. We first give a brief overview of human emotions through the main emotion models. We then examine how emotions are represented in unimodal data sources and how each single channel captures emotional cues. Next, we survey advances in multimodal fusion techniques, which combine multiple data sources to detect emotions more accurately, and describe three predominant fusion architectures along with the approach and benefits of each. Despite substantial progress in the field, several challenges persist. The final section of the article highlights these open issues and points to potential areas for further research and development. By bringing together emotion models, unimodal representations, fusion techniques, and open challenges, the article aims to give readers a comprehensive picture of the current state of multimodal emotion recognition.

