Abstract

Multi-view facial expression recognition (FER) is a challenging computer vision task due to the large intra-class variation caused by viewpoint changes. This paper presents a novel orthogonal channel attention-based multi-task learning (OCA-MTL) approach for FER. The proposed OCA-MTL approach adopts a Siamese convolutional neural network (CNN) to force the multi-view expression recognition model to learn the same features as the frontal expression recognition model. To further improve recognition accuracy on non-frontal expressions, the multi-view model adopts a multi-task learning framework that treats head pose estimation (HPE) as an auxiliary task. A separated channel attention (SCA) module is embedded in the multi-task learning framework to generate individual attention for FER and HPE. Furthermore, an orthogonal channel attention loss is presented to force the model to employ different feature channels to represent facial expression and head pose, thereby decoupling them. The proposed approach is evaluated on two public facial expression datasets and achieves an average recognition accuracy of 88.41% under 13 viewpoints on Multi-PIE and 89.04% under 5 viewpoints on KDEF, outperforming state-of-the-art methods.
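The abstract does not give the exact form of the orthogonal channel attention loss, but the stated goal (pushing the FER and HPE branches to attend to different feature channels) can be illustrated with a minimal sketch. The function name, normalization, and squared-dot-product form below are assumptions for illustration, not the paper's formulation:

```python
# Hypothetical sketch: an orthogonality penalty on the two channel-attention
# vectors produced by the SCA module, one for FER and one for HPE. The loss
# is zero when the tasks attend to disjoint channels and grows as they overlap.

def orthogonal_attention_loss(a_fer, a_hpe):
    """Squared dot product of the two L2-normalized channel-attention vectors."""
    def l2_normalize(v):
        norm = sum(x * x for x in v) ** 0.5
        return [x / norm for x in v] if norm > 0 else list(v)

    u = l2_normalize(a_fer)
    w = l2_normalize(a_hpe)
    dot = sum(x * y for x, y in zip(u, w))
    return dot * dot

# Disjoint attention over 4 channels -> zero penalty.
disjoint = orthogonal_attention_loss([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
# Identical attention -> maximal penalty of 1.0 after normalization.
identical = orthogonal_attention_loss([0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0])
```

In training, a term like this would be added to the FER and HPE task losses so that minimizing it drives the two attention vectors toward orthogonality, decoupling expression features from pose features.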
