Abstract

Affective computing, a field at the intersection of cognitive science, linguistics, and AI, seeks to enhance human–computer interaction. Because human emotions are complex and manifest across multiple channels, this paper advocates a multi-modal approach to emotion recognition. Such an approach enables the discernment of subtle emotional cues across several modalities, thereby advancing the field of multi-modal affective computing. Focusing on a trimodal framework, this paper examines emotion recognition and sentiment analysis through text, voice, and visual data. It outlines key developments, current trends, and prominent datasets in trimodal emotional analysis. It also explores data fusion strategies across modalities and assesses the effectiveness of various fusion techniques. The paper presents detailed emotion models, recent advancements, and key trimodal databases, while thoroughly addressing challenges such as data processing and the inherent complexities of trimodal affective computing (TAC). Finally, it highlights potential future directions, underscoring the importance of benchmark databases and practical applications in deepening our understanding of the nuanced spectrum of human emotions.
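As a point of reference for the fusion strategies surveyed in the paper, the sketch below illustrates one common baseline, feature-level (early) fusion, in which pre-extracted text, voice, and visual embeddings are concatenated and passed to a shared classifier. This is a minimal illustration only; the module name, feature dimensions, and number of emotion classes are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class FeatureFusionClassifier(nn.Module):
    """Feature-level (early) fusion: concatenate per-modality embeddings,
    then classify with a small MLP. All dimensions are illustrative."""

    def __init__(self, text_dim=768, audio_dim=128, visual_dim=512,
                 hidden_dim=256, num_emotions=7):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(text_dim + audio_dim + visual_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, num_emotions),
        )

    def forward(self, text_feat, audio_feat, visual_feat):
        # Concatenate modality features along the feature axis, then classify.
        fused = torch.cat([text_feat, audio_feat, visual_feat], dim=-1)
        return self.fusion(fused)

# Example: a batch of 4 utterances with pre-extracted features per modality.
model = FeatureFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 128), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 7])
```

Decision-level (late) fusion, by contrast, would train a separate classifier per modality and combine their predictions, e.g. by averaging or weighted voting.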
