Abstract

Micro-Expressions (MEs) are involuntary and imperceptible facial movements that reflect underlying emotions and inner activities. Recently, ME recognition technology has been widely used in several fields such as medical treatment. Because the variations across a video sequence are subtle and training data are limited, ME recognition remains a challenging problem. Existing methods tend to address it from two aspects: (1) data augmentation and (2) expression signal amplification. Few works recognize the importance of the temporal variation hidden in the ME sequence. Based on this observation, we propose a Graph Contrastive Learning (GCL) framework to effectively perceive subtle temporal variation for robust ME recognition. Specifically, a strong spatial feature representation is captured by a transformer-based ME feature encoder. The proposed GCL then builds a graph structure over the ME sequence and applies graph convolution to model the temporal relationships. To capture and highlight the temporal variation hidden in the ME sequence, a contrastive learning framework is designed to learn discriminatively the differences between normal and abnormal ME samples. Both quantitative and qualitative experimental results show the effectiveness and superiority of our method compared with prior state-of-the-art methods.
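
The pipeline outlined above (per-frame features, a graph over the sequence, graph convolution, then a contrastive objective) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the graph is constructed, how abnormal samples are obtained, or which contrastive loss is used. The chain graph over adjacent frames, the InfoNCE-style loss, the random stand-in for encoder outputs, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def temporal_chain_adjacency(num_frames):
    """Adjacency with self-loops linking each frame to its temporal neighbours.
    Assumption: the abstract does not specify the graph construction."""
    adj = np.eye(num_frames)
    idx = np.arange(num_frames - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(features, adj_norm, weight):
    """One graph convolution layer: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss on cosine similarities (assumed loss, not from the source).
    The loss is small when the anchor is closer to the positive than to the negatives."""
    def unit(v):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)
    a, p, n = unit(anchor), unit(positive), unit(negatives)
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    logits -= logits.max()  # numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))

# Usage: random vectors stand in for transformer encoder outputs (8 frames, 16 dims).
rng = np.random.default_rng(0)
frames = rng.normal(size=(8, 16))
adj_norm = normalize_adjacency(temporal_chain_adjacency(8))
weight = rng.normal(size=(16, 4))
node_emb = graph_conv(frames, adj_norm, weight)  # per-frame embeddings after message passing
seq_emb = node_emb.mean(axis=0)                  # pooled sequence embedding
```

In this sketch, the anchor and positive passed to `info_nce` would be embeddings of two normal sequences and the negatives would be embeddings of abnormal ones; how those samples are actually formed is left open here because the abstract does not describe it.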
