Abstract

Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, a lack of training data, and an intrinsic imbalance in existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques but also proposes a novel loss function, called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves intra-class compactness and inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained on the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on the AffectNet real-world facial expression dataset demonstrate that our method outperforms baseline CNN models that use either weighted-softmax loss or center loss.

Highlights

  • Facial expressions are undoubtedly a dominant, natural, and effective channel used by people to convey their emotions and intentions during communication

  • We constructed deep convolutional neural network (CNN)-based facial expression recognition (FER) models to recognize eight common facial expressions from the AffectNet database of facial expression, valence, and arousal computing in the wild [5]

  • By analyzing the complementary nature of weighted loss and center loss, we propose a new loss function, named weighted-cluster loss, which takes highly skewed facial emotion data into consideration and combines multiple loss terms to improve the performance of the FER models


Introduction

Facial expressions are undoubtedly a dominant, natural, and effective channel used by people to convey their emotions and intentions during communication. To address the overfitting problem, a deep transfer-based framework is proposed in which we utilize an SE-ResNet-50 model [21] (which was pre-trained on VGGFace2 data [22]) and fine-tune it on the AffectNet dataset. To alleviate the imbalanced-data problem, we propose a new loss function, named weighted-cluster loss, which weights each emotion class's loss terms according to its relative proportion of the total number of samples in the training dataset. The experimental results on the AffectNet dataset [5] show that our transfer learning-based model with weighted-cluster loss outperforms other models that use either weighted-softmax loss or center loss.
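The paper does not reproduce the loss formula in this summary, but the idea described above can be sketched as follows: a softmax cross-entropy term weighted by each class's (inverse) share of the training data, plus a center-loss-style cluster term that pulls features toward learned per-class centers, weighted the same way. The inverse-frequency weighting scheme and the trade-off coefficient `lam` below are assumptions for illustration and may differ from the paper's exact formulation.

```python
import numpy as np

def weighted_cluster_loss(logits, features, labels, centers, class_counts, lam=0.01):
    """Hedged sketch of a weighted-cluster loss: class-weighted softmax
    cross-entropy plus a class-weighted center (cluster) term.
    `lam` and the inverse-frequency weighting are illustrative assumptions."""
    counts = np.asarray(class_counts, dtype=float)
    # Inverse-frequency class weights: minority emotion classes get larger weights.
    w = counts.sum() / (len(counts) * counts)

    # Numerically stable log-softmax, then per-sample weighted cross-entropy.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -(w[labels] * log_probs[np.arange(len(labels)), labels]).mean()

    # Cluster term: squared distance of each feature to its class center,
    # weighted by the same class weights to respect the data imbalance.
    diff = features - centers[labels]
    cluster = (w[labels] * (diff ** 2).sum(axis=1)).mean()

    return ce + lam * cluster
```

In a full training loop, the class centers would be trainable parameters (or updated by a moving average, as in the original center-loss formulation) rather than fixed arrays as in this forward-only sketch.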

Facial Expression Recognition Approaches
Transfer Learning for Facial Expression Recognition
Data Re-Sampling and Augmentation
Weighted Loss and Auxiliary Loss
Methods
Base Models
Pre-Training
Fine-Tuning
Review of Weighted-Softmax Loss
Review of Center Loss
The Proposed Weighted-Cluster Loss
AffectNet Dataset
Evaluation Metrics
Experiment Setups and Implementation Details
Results and Discussions
Threats to Validity
Conclusions
