Abstract

Emotion recognition based solely on facial expressions suffers from low accuracy and questionable reliability because expressions can be faked. In this paper, a method is proposed to recognize fake emotions using multiple visual cues derived from a single information source: facial expressions, eye states, and physiological signals, all captured from video. An algorithm based on a graph neural network extracts spatial- and spectral-domain features from facial images for facial expression recognition. A model-based method decomposes RGB signals into heart-rate estimates. A deep model, trained on a labeled dataset that we created, segments the eye region. After extracting these signals from video, different fusion strategies are applied to evaluate emotion recognition performance based on the multiple signals. In the experiments, the CK+, TFEID, JAFFE, RAF, PURE, and ESLD datasets are used to measure the accuracy of facial expression recognition, heart rate detection, and eye region segmentation. The results show that multimodality is effective in improving accuracy and that eye state can serve as a cue for trusted emotion recognition. Compared with methods based on the eye state alone and on the noncontact physiological signal alone, the multimodal approach improves accuracy by 26.19% and 9.52%, respectively.
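To make the multimodal pipeline concrete, the sketch below illustrates one of the fusion strategies the abstract alludes to: score-level (late) fusion of per-modality class probabilities, with the eye state used as a plausibility cue. This is a minimal illustration, not the paper's published configuration; the fusion weights, the emotion label set, and the `eye_open_ratio` heuristic are all assumptions introduced here for clarity.

```python
# Minimal sketch of score-level (late) fusion over three modalities, assuming
# each branch (expression GNN, heart-rate estimator, eye-region segmenter) has
# already been reduced to a per-class probability vector. Weights and the
# eye-state gate are illustrative placeholders, not the paper's settings.
import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def late_fusion(p_face: np.ndarray,
                p_physio: np.ndarray,
                p_eye: np.ndarray,
                weights=(0.5, 0.3, 0.2)) -> np.ndarray:
    """Weighted sum of per-modality class probabilities (hypothetical weights)."""
    fused = weights[0] * p_face + weights[1] * p_physio + weights[2] * p_eye
    return fused / fused.sum()  # renormalize to a proper distribution

def trusted_emotion(p_face, p_physio, p_eye, eye_open_ratio: float):
    """Fuse modality scores and flag the prediction as untrusted when the eye
    state contradicts the claimed expression (an assumed fake-expression cue)."""
    fused = late_fusion(p_face, p_physio, p_eye)
    label = EMOTIONS[int(np.argmax(fused))]
    # Assumed heuristic: a high-arousal claim with nearly closed eyes is suspect.
    trusted = not (label in {"surprise", "fear"} and eye_open_ratio < 0.2)
    return label, trusted

# Example with dummy probability vectors from the three branches.
p_face = np.array([0.05, 0.05, 0.05, 0.10, 0.05, 0.65, 0.05])
p_physio = np.array([0.10, 0.10, 0.15, 0.15, 0.10, 0.30, 0.10])
p_eye = np.array([0.10, 0.10, 0.10, 0.20, 0.10, 0.20, 0.20])
print(trusted_emotion(p_face, p_physio, p_eye, eye_open_ratio=0.1))
```

Score-level fusion as above is only one of the strategies the paper evaluates; feature-level fusion, where modality features are concatenated before a joint classifier, is the usual alternative.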
