Abstract

Most Facial Expression Recognition (FER) systems rely on machine learning approaches that require large databases for effective training. As these are not easily available, a good solution is to augment the databases with appropriate data augmentation (DA) techniques, typically based on either geometric transformations or oversampling augmentations (e.g., generative adversarial networks (GANs)). However, it is not always easy to understand which DA technique is more convenient for FER systems, because most state-of-the-art experiments use different settings, making the impact of the DA techniques hard to compare. To advance in this respect, in this paper we evaluate and compare the impact of well-established DA techniques on the emotion recognition accuracy of a FER system based on the well-known VGG16 convolutional neural network (CNN). In particular, we consider both geometric transformations and a GAN to increase the number of training images. We performed cross-database evaluations: training with the "augmented" KDEF database and testing with two different databases (CK+ and ExpW). The best results were obtained by combining horizontal reflection, translation and GAN, yielding an accuracy increase of approximately 30%. This outperforms alternative approaches, except for one technique that, however, relied on a considerably larger database.
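The geometric side of the DA pipeline described above (horizontal reflection plus translation) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name, the 48x48 crop size and the 4-pixel shift are illustrative assumptions.

```python
import numpy as np

def augment(image, shift=4):
    """Return geometric augmentations of a face image:
    the original, its horizontal reflection, and a translated copy."""
    flipped = image[:, ::-1]                    # horizontal reflection (mirror)
    translated = np.roll(image, shift, axis=1)  # shift right by `shift` pixels
    translated[:, :shift] = 0                   # zero-fill the vacated border
    return [image, flipped, translated]

# One dummy grayscale "face" crop yields three training samples.
face = np.arange(48 * 48, dtype=np.float32).reshape(48, 48)
samples = augment(face)
print(len(samples))  # 3
```

Applied to every image of the training set, such transformations multiply its size without collecting new data; the GAN-based augmentation instead synthesizes entirely new face images.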

Highlights

  • Facial Expression Recognition (FER) is a challenging task involving scientists from different research fields, such as psychology, physiology and computer science, whose importance has been growing in the last years due to the vast areas of possible applications, e.g., human–computer interaction, gaming and healthcare

  • The Karolinska Directed Emotional Faces (KDEF) database is a set of 4900 pictures of human facial expressions with associated emotion label

  • We experimented on the use of generative adversarial network (GAN)-based data augmentation (DA), combined with other geometric DA techniques, for FER purposes


Introduction

Facial Expression Recognition (FER) is a challenging task involving scientists from different research fields, such as psychology, physiology and computer science, whose importance has grown in recent years due to the vast range of possible applications, e.g., human–computer interaction, gaming and healthcare. Although numerous studies have been conducted on FER, it remains one of the hardest tasks for image classification systems, for two main reasons: facial features from one subject in two different expressions can be very close in the feature space, while facial features from two subjects with the same expression may be very far from each other [4]. For these reasons, cross-database analyses, i.e., training the system with one database and testing it with another, are preferred to improve the validity of FER systems.
