Abstract

Facial expression recognition (FER) in the wild is challenging due to various unconstrained conditions, e.g., occlusions and head pose variations. Previous methods tend to improve recognition performance by resorting to holistic methods or coarse local-based methods, while ignoring the fine-grained structure of local features and the correlations between features. In this paper, we propose a Fine-Grained Association Graph Representation (FG-AGR) framework that captures local fine-grained facial expression representations. Firstly, an Adaptive Salient Region Induction (ASRI) module is designed to adaptively highlight the locally salient regions of facial expressions in combination with spatial location information. Secondly, building on these regions, a Local Fine-grained Feature Extraction (LFFE) module based on Visual Transformers is introduced to extract subtle but discriminative fine-grained features from the salient regions. Thirdly, an Adaptive Graph Association Reasoning (AGAR) module based on a Graph Convolutional Network is constructed to learn combinations of associated fine-grained features. Extensive experiments demonstrate that our FG-AGR achieves superior performance compared to state-of-the-art methods, with 90.81% on RAF-DB, 64.91% on AffectNet-7, 60.69% on AffectNet-8 and 91.09% on FERPlus.
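To make the idea of graph-based association over region features concrete, the following is a minimal sketch of a single generic graph-convolution step, not the AGAR module itself. It assumes that each salient region is already summarized as a feature vector and that the adjacency is derived from a row-wise softmax over pairwise cosine similarity; the function names, the dimensions, and the choice of similarity are illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def adaptive_adjacency(features):
    """Build a row-normalised adjacency from pairwise cosine similarity.

    Assumption: similarity-based edges stand in for whatever adaptive
    association the actual method learns.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T  # (N, N) cosine similarities
    # Numerically stable softmax over each row, so every row sums to 1.
    exp = np.exp(sim - sim.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def graph_conv(features, adjacency, weight):
    """One generic graph-convolution step: ReLU(A X W)."""
    return np.maximum(adjacency @ features @ weight, 0.0)


rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 8))        # 5 hypothetical regions, 8-dim features
weight = rng.normal(size=(8, 4)) * 0.1   # hypothetical projection to 4 dims

adj = adaptive_adjacency(regions)
out = graph_conv(regions, adj, weight)
```

Each output row mixes a region's own features with those of the regions it is most similar to, which is the general sense in which a graph convolution can model associations between local features.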
