Abstract

The goal of emotion detection is to identify and recognize emotions in text, speech, gestures, facial expressions, and other modalities. This paper proposes an effective multimodal emotion recognition system based on facial expressions, sentence-level text, and voice. Using public datasets, we examine facial expression image classification and feature extraction. Tri-modal fusion integrates the outputs of the three modalities to produce the final emotion label. The proposed method was validated on classroom students, and the detected emotions correlate with their performance. The method categorizes students' expressions into seven emotions: happiness, surprise, sadness, fear, disgust, anger, and contempt. Compared with the unimodal models, the proposed multimodal network achieves up to 65% accuracy. The method can also detect negative states such as boredom or loss of interest in the learning environment.
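The abstract does not specify the exact fusion rule, so the following is a minimal sketch of one common choice, decision-level (late) fusion: each unimodal classifier outputs a probability distribution over the seven emotion classes, and a weighted average of those distributions yields the final label. The function name, interface, and weights here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# The seven emotion classes named in the abstract.
EMOTIONS = ["happiness", "surprise", "sadness", "fear",
            "disgust", "anger", "contempt"]

def fuse_trimodal(p_face, p_text, p_voice, weights=(0.4, 0.3, 0.3)):
    """Decision-level (late) fusion sketch: combine per-modality class
    probabilities with a weighted average and return the top class.

    p_face, p_text, p_voice: length-7 probability vectors from the
    unimodal classifiers (hypothetical interface; the paper does not
    specify its fusion rule or modality weights).
    """
    probs = np.stack([p_face, p_text, p_voice])  # shape (3, 7)
    w = np.asarray(weights)[:, None]             # shape (3, 1)
    fused = (w * probs).sum(axis=0)              # weighted average, shape (7,)
    fused /= fused.sum()                         # renormalize to a distribution
    return EMOTIONS[int(np.argmax(fused))], fused

# Example: each modality emits a softmax distribution over the 7 classes.
face  = np.array([0.60, 0.10, 0.05, 0.05, 0.05, 0.10, 0.05])
text  = np.array([0.30, 0.05, 0.30, 0.10, 0.05, 0.15, 0.05])
voice = np.array([0.50, 0.10, 0.10, 0.10, 0.05, 0.10, 0.05])
label, dist = fuse_trimodal(face, text, voice)
print(label)  # -> "happiness" under these illustrative inputs
```

Weighted averaging is only one plausible instantiation; the paper's tri-modal fusion could equally be feature-level concatenation or a learned fusion layer.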
