Abstract
Facial expression recognition (FER) in the wild is a challenging task in affective computing for human-machine interaction. However, most existing methods, trained with a simple cross-entropy loss, fail to focus on the most discriminative facial regions because of the class imbalance common in FER datasets, which limits model robustness and interpretability. In addition, these methods capture only local features of the original images with multi-size shallow convolutions and ignore facial texture characteristics, leading to suboptimal recognition performance. To address these issues, we propose a novel facial expression recognition network, the attention-rectified and texture-enhanced cross-attention transformer feature fusion network (AR-TE-CATFFNet). Specifically, an attention-rectified convolution block is first designed to guide multiple convolution heads toward critical facial regions and improve model generalization. Second, a texture enhancement block captures texture features through the local binary pattern (LBP) and the gray-level co-occurrence matrix (GLCM), compensating for the lack of texture information. Finally, a cross-attention transformer feature fusion block deeply integrates RGB and texture features at a global level, which boosts recognition accuracy. Experiments on three public datasets validate the efficacy of the proposed method, which achieves classification accuracies of 89.50% on RAF-DB, 65.66% on AffectNet, and 74.84% on FER2013, outperforming existing methods. The code of our proposed method will be available at https://github.com/smy17/AR-TE-CATFFNet.
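To make the described pipeline concrete, the following is a minimal sketch, not the authors' released implementation: it assumes a Python/PyTorch setting, extracts hand-crafted texture cues with scikit-image's LBP and GLCM routines, and fuses RGB and texture token sequences with a bidirectional cross-attention module. All module names, dimensions, and the pooling choice are illustrative assumptions.

# Illustrative sketch of the abstract's two key ideas (not the paper's code):
# LBP/GLCM texture extraction and cross-attention fusion of RGB and texture tokens.
import numpy as np
import torch
import torch.nn as nn
from skimage.feature import local_binary_pattern, graycomatrix, graycoprops

def texture_features(gray):
    """Return a per-pixel LBP map and a few GLCM statistics for a uint8 gray image."""
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")  # local texture codes
    glcm = graycomatrix(gray, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    stats = np.array([graycoprops(glcm, p).mean()
                      for p in ("contrast", "homogeneity", "energy", "correlation")])
    return lbp, stats

class CrossAttentionFusion(nn.Module):
    """Fuse RGB tokens and texture tokens with bidirectional cross-attention."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.rgb_queries_tex = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.tex_queries_rgb = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tokens, tex_tokens):
        # Each stream attends to the other, so local RGB cues and global
        # texture statistics are integrated before classification.
        rgb_out, _ = self.rgb_queries_tex(rgb_tokens, tex_tokens, tex_tokens)
        tex_out, _ = self.tex_queries_rgb(tex_tokens, rgb_tokens, rgb_tokens)
        fused = self.norm(rgb_out + tex_out)
        return fused.mean(dim=1)  # pooled feature for the expression classifier

# Smoke test with random stand-in tokens (batch=2, 49 tokens, dim=256).
fusion = CrossAttentionFusion()
print(fusion(torch.randn(2, 49, 256), torch.randn(2, 49, 256)).shape)

In the paper's design, the texture tokens would come from the texture enhancement block and the RGB tokens from the attention-rectified convolution backbone; here random tensors stand in for both streams.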