Abstract

Despite advances in computer technology, emotion recognition remains a challenging problem in Human-Computer Interaction (HCI). In this paper, we present a novel multimodal emotion recognition system that recognizes emotions from audio, video, and text data using deep convolutional neural networks. The system recognizes seven emotion classes: happiness, anger, sadness, fear, disgust, surprise, and neutral. We used three datasets to train and test the system, one for each of the three input modalities. The results show a recognition accuracy of 100% for audio, 69% for video, and 64% for text. With decision-level fusion, the recorded accuracy is 80%. These results indicate that the system is effective in recognizing human emotions.
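
For readers unfamiliar with decision-level fusion, the following is a minimal sketch of one common variant: each modality's classifier outputs a probability for every emotion class, and the fused decision is the class with the highest weighted average. The abstract does not say which fusion rule the system uses, so the averaging rule, the function name `fuse_decisions`, the weights, and all probability values below are illustrative assumptions and not the authors' method or data.

```python
# Illustrative sketch only: weighted-average decision-level fusion.
# The fusion rule, names, and numbers are assumptions, not taken from the paper.

EMOTIONS = ["happiness", "anger", "sadness", "fear", "disgust", "surprise", "neutral"]


def fuse_decisions(probs_by_modality, weights=None):
    """Combine per-modality class probabilities into one decision.

    probs_by_modality maps a modality name to a list of probabilities,
    one per entry in EMOTIONS. weights optionally maps the same names to
    non-negative numbers; equal weights are used when it is omitted.
    Returns the winning label and the fused probability list.
    """
    if weights is None:
        weights = {name: 1.0 for name in probs_by_modality}
    total = sum(weights[name] for name in probs_by_modality)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = [0.0] * len(EMOTIONS)
    for name, probs in probs_by_modality.items():
        if len(probs) != len(EMOTIONS):
            raise ValueError(f"{name}: expected {len(EMOTIONS)} probabilities")
        share = weights[name] / total  # normalized weight of this modality
        for i, p in enumerate(probs):
            fused[i] += share * p

    best = max(range(len(EMOTIONS)), key=fused.__getitem__)
    return EMOTIONS[best], fused


if __name__ == "__main__":
    # Hypothetical classifier outputs for a single sample.
    sample = {
        "audio": [0.10, 0.60, 0.10, 0.05, 0.05, 0.05, 0.05],
        "video": [0.50, 0.20, 0.10, 0.05, 0.05, 0.05, 0.05],
        "text":  [0.20, 0.40, 0.20, 0.05, 0.05, 0.05, 0.05],
    }
    label, fused = fuse_decisions(sample)
    print(label, [round(p, 3) for p in fused])
```

In this sketch the video classifier alone would pick a different class than the audio and text classifiers, and the fused decision follows the majority of the probability mass. Passing `weights` lets a more reliable modality count for more, which is one plausible way to use per-modality accuracies, though again this is an assumption and not a description of the system.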
