Abstract

Recognition of human emotion from facial expressions is affected by distortions of pictorial quality and facial pose, which are often ignored by traditional video emotion recognition methods. On the other hand, context information can provide varying degrees of additional clues that can further improve recognition accuracy. In this paper, we first build a video dataset with seven categories of human emotion, named human emotion in the video (HEIV). With the HEIV dataset, we train a context-aware attention network (CAAN) to recognize human emotion. The network consists of two subnetworks that process face and context information, respectively. Features from facial expressions and context clues are fused to represent the emotion of each video frame and are then passed through an attention network to generate emotion scores. The emotion features of all frames are then aggregated according to these scores. Experimental results show that our proposed method is effective on the HEIV dataset.
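As a rough illustration of the pipeline described above, the sketch below fuses per-frame face and context features, scores each frame with an attention branch, and aggregates the weighted frame features into a video-level prediction. All module choices, feature dimensions, and names (CAANSketch, face_net, context_net) are assumptions made for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class CAANSketch(nn.Module):
    """Illustrative sketch of a context-aware attention network.

    Two sub-networks encode the cropped face and the surrounding context
    of each frame; their features are fused, scored by an attention
    branch, and aggregated over time into a video-level emotion
    prediction. Encoders and dimensions are stand-ins, not the paper's
    actual backbones.
    """
    def __init__(self, feat_dim=256, num_emotions=7):
        super().__init__()
        # Hypothetical per-frame encoders (CNN backbones in practice).
        self.face_net = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        self.context_net = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        # Attention branch: one scalar emotion score per frame.
        self.attn = nn.Linear(2 * feat_dim, 1)
        # Classifier over the aggregated video-level feature.
        self.classifier = nn.Linear(2 * feat_dim, num_emotions)

    def forward(self, face_feats, context_feats):
        # face_feats, context_feats: (batch, num_frames, 512)
        f = self.face_net(face_feats)
        c = self.context_net(context_feats)
        frame_feats = torch.cat([f, c], dim=-1)                 # fuse face + context
        scores = torch.softmax(self.attn(frame_feats), dim=1)   # per-frame weights
        video_feat = (scores * frame_feats).sum(dim=1)          # weighted aggregation
        return self.classifier(video_feat)

# Example: 2 videos, 16 frames each, 512-d precomputed features per stream.
model = CAANSketch()
logits = model(torch.randn(2, 16, 512), torch.randn(2, 16, 512))
print(logits.shape)  # torch.Size([2, 7])
```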

Highlights

  • Estimating a person’s emotional state is essential in our everyday life. This capacity is necessary to perceive and anticipate people’s reactions [1]

  • Chu et al. [2] proposed a human emotion recognition method based on the facial action coding system, which encodes facial expressions through a series of movements of specific facial locations. The action units can be identified from geometric and appearance features extracted from face images [3] (a toy sketch of such geometric features follows this list)

  • To overcome the above two challenges, inspired by the attention mechanism [8, 9], we propose a context-aware attention network (CAAN), which is robust to frames containing less emotional information and simultaneously uses the rich emotional clues provided by the other frames
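To make the facial-action-coding idea in the second highlight concrete, the toy function below derives a few normalized geometric features from facial landmark coordinates, the kind of measurements from which action units are typically identified. The landmark indices and the action-unit analogies are purely illustrative assumptions, not a standard landmark scheme.

```python
import numpy as np

def geometric_au_features(landmarks):
    """Toy geometric features in the spirit of FACS-based methods.

    `landmarks` is assumed to be an (N, 2) array of 2-D facial landmark
    coordinates; the indices below are illustrative only.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    # Normalize by inter-ocular distance so features are scale-invariant.
    left_eye, right_eye = landmarks[0], landmarks[1]
    scale = np.linalg.norm(right_eye - left_eye) + 1e-8
    brow, eye_top = landmarks[2], landmarks[3]
    mouth_left, mouth_right = landmarks[4], landmarks[5]
    mouth_top, mouth_bottom = landmarks[6], landmarks[7]
    return np.array([
        np.linalg.norm(brow - eye_top) / scale,            # brow raise proxy
        np.linalg.norm(mouth_right - mouth_left) / scale,  # lip stretch proxy
        np.linalg.norm(mouth_bottom - mouth_top) / scale,  # jaw drop proxy
    ])

# Example with 8 made-up landmark points.
print(geometric_au_features(np.random.rand(8, 2)))
```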


Summary

Introduction

Estimating a person’s emotional state is essential in our everyday life; this capacity is necessary to perceive and anticipate people’s reactions [1]. This emotion recognition challenge has a wide range of applications. The human face contains rich emotional clues, and context information can provide extra clues for recognizing emotion. Kosti et al. [6] built an emotions-in-context database and showed that emotion recognition accuracy improves when the person and the whole scene are analyzed jointly. Chen et al. [7] exploited context clues, including events, objects, and scenes, to improve video emotion recognition performance. However, these methods treat the features of different frames equally and ignore the differences in emotional information between frames.
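The limitation noted in the last sentence can be seen in a minimal sketch: uniform averaging gives every frame the same influence on the video feature, whereas attention-weighted aggregation lets frames with higher (learned) emotion scores dominate. Random tensors stand in for learned features and scores here.

```python
import torch

# Frame-level features for one video: (num_frames, feat_dim).
frame_feats = torch.randn(16, 512)

# Prior approach: every frame contributes equally to the video feature.
uniform_feat = frame_feats.mean(dim=0)

# Attention-based aggregation: frames with higher scores dominate.
# Random scores stand in for a learned attention branch.
scores = torch.softmax(torch.randn(16, 1), dim=0)
attended_feat = (scores * frame_feats).sum(dim=0)

print(uniform_feat.shape, attended_feat.shape)  # both torch.Size([512])
```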

