Abstract

In the facial expression recognition task, a convolutional neural network (CNN) model that performs well on one dataset (the source dataset) usually performs poorly on another (the target dataset), because the feature distribution of the same emotion varies across datasets. To improve the cross-dataset accuracy of CNN models, we introduce an unsupervised domain adaptation method that is especially suitable for small, unlabelled target datasets. To address the shortage of samples in the target dataset, we train a generative adversarial network (GAN) on the target dataset and use the GAN-generated samples to fine-tune a model pretrained on the source dataset. During fine-tuning, we dynamically assign the unlabelled GAN-generated samples distributed pseudolabels according to their current prediction probabilities. Our method can be easily applied to any existing CNN. We demonstrate its effectiveness on four facial expression recognition datasets with two CNN architectures and obtain encouraging results.
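The abstract does not give the exact pseudolabelling rule, but the idea of turning a model's current prediction probabilities into "distributed" (soft, non-one-hot) pseudolabels can be sketched as follows. This is a minimal illustration assuming temperature sharpening of the predicted distribution; the `temperature` hyperparameter and the function name are illustrative, not taken from the paper.

```python
import numpy as np

def distributed_pseudolabels(probs: np.ndarray, temperature: float = 0.5) -> np.ndarray:
    """Convert current prediction probabilities into soft (distributed)
    pseudolabels: raise each probability to 1/temperature, then renormalise
    each row so it remains a valid distribution over emotion classes.

    probs: (n_samples, n_classes) array of softmax outputs for the
           unlabelled GAN-generated samples at the current training step.
    """
    sharpened = probs ** (1.0 / temperature)
    return sharpened / sharpened.sum(axis=1, keepdims=True)

# Example: current predictions for 2 GAN-generated faces over 3 emotion classes
probs = np.array([[0.5, 0.3, 0.2],
                  [0.1, 0.7, 0.2]])
labels = distributed_pseudolabels(probs)
```

Because the pseudolabels are recomputed from the model's predictions at each step, they evolve as fine-tuning proceeds, while the soft (distributed) form avoids committing to a possibly wrong hard label early on.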

Highlights

  • Facial expression recognition (FER) has a wide spectrum of potential applications in human-computer interaction, cognitive psychology, computational neuroscience, and medical healthcare

  • We train VGG11 with FER-2013 as the source dataset and JAFFE as the target dataset to examine our method on a different convolutional neural network (CNN) structure, and the recognition accuracy increases by 15.02%

  • The experimental results show that our method improves the CNN model’s recognition accuracy on the target dataset across different datasets as well as different CNN structures

Introduction

Facial expression recognition (FER) has a wide spectrum of potential applications in human-computer interaction, cognitive psychology, computational neuroscience, and medical healthcare. Convolutional neural networks (CNN) have achieved many exciting results in artificial intelligence and pattern recognition and have been successfully applied to facial expression recognition [1]. Jaiswal et al. [2] present a novel approach to facial action unit detection using a combination of Convolutional and Bidirectional Long Short-Term Memory Neural Networks (CNN-BLSTM), which jointly learns shape, appearance, and dynamics in a deep learning manner. Neagoe et al. [6] propose a model for subject-independent emotion recognition from facial expressions using a combined CNN and DBN. These CNN models are often trained and tested on the same dataset, whereas cross-dataset performance has received less attention. To make facial expression recognition systems more practical, it is necessary to improve the generalization ability of the recognition model.
