Facial Expression Recognition (FER) has long been a challenging task in the field of computer vision. In this paper, we present a novel model, named Deep Attentive Multi-path Convolutional Neural Network (DAM-CNN), for FER. Different from most existing models, DAM-CNN can automatically locate expression-related regions in an expressional image and yield a robust image representation for FER. The proposed model contains two novel modules: an attention-based Salient Expressional Region Descriptor (SERD) and the Multi-Path Variation-Suppressing Network (MPVS-Net). SERD can adaptively estimate the importance of different image regions for FER task, while MPVS-Net disentangles expressional information from irrelevant variations. By jointly combining SERD and MPVS-Net, DAM-CNN is able to highlight expression-relevant features and generate a variation-robust representation for expression classification. Extensive experimental results on both constrained datasets (CK+, JAFFE, TFEID) and unconstrained datasets (SFEW, FER2013, BAUM-2i) demonstrate the effectiveness of our DAM-CNN model.
Read full abstract