Abstract

Combining global and local features is an important way to improve discriminative performance in facial expression recognition tasks. Existing methods are limited in that they fail to extract crucial local features and ignore the complementary effects of local and global features. To address these problems, this paper proposes a Weakly Supervised Local-Global Attention Network (WS-LGAN), which uses the attention mechanism to handle the part localization and feature fusion problems. First, an Attention Map Generator is designed to obtain a set of attention maps under weak supervision. It mimics the attention mechanism of the human brain and quickly finds the local regions of interest. Second, bilinear attention pooling is employed to generate and refine local features based on the attention maps. Third, a building block called the Selective Feature Unit is designed, which allows adaptive weighted fusion of global and local features before classification. In WS-LGAN, global and local features represent expressions from different aspects. Compared with methods that rely on a single type of feature, WS-LGAN benefits from the complementary advantages of local and global features. Additionally, a contrastive loss is introduced for both local and global features to increase inter-class dispersion and intra-class compactness at different granularities. Experiments on three popular facial expression datasets, two lab-controlled and one real-world, show that WS-LGAN achieves state-of-the-art performance in facial expression recognition.
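As a rough illustration of the two operations named in the abstract, the following NumPy sketch shows bilinear attention pooling in its commonly used form (each attention map is multiplied element-wise with the feature maps and the result is average-pooled) followed by a softmax-weighted fusion of a global and a local feature vector. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the tensor shapes, and the per-channel softmax used for fusion are all hypothetical stand-ins, and the actual design of the Selective Feature Unit is not described in the abstract.

```python
import numpy as np


def bilinear_attention_pooling(features, attention_maps):
    """Pool one local feature vector per attention map.

    features:       array of shape (C, H, W), backbone feature maps.
    attention_maps: array of shape (M, H, W), one map per local region.
    Returns an array of shape (M, C).
    """
    _, height, width = features.shape
    # Multiply every attention map with every feature channel element-wise,
    # then average over the spatial dimensions.
    pooled = np.einsum("mhw,chw->mc", attention_maps, features)
    return pooled / (height * width)


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def weighted_fusion(global_feat, local_feat, scores):
    """Fuse two (C,) feature vectors with per-channel softmax weights.

    scores: array of shape (2, C) of unnormalised branch scores. In a real
    network these would be learned; here they are supplied by the caller
    (an assumption made purely for illustration).
    """
    weights = softmax(scores, axis=0)  # weights sum to 1 across the branches
    return weights[0] * global_feat + weights[1] * local_feat


# Toy usage with random inputs (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
feats = rng.random((8, 7, 7))        # C=8 channels on a 7x7 grid
att = rng.random((3, 7, 7))          # M=3 attention maps
local = bilinear_attention_pooling(feats, att).mean(axis=0)  # (C,)
glob = feats.mean(axis=(1, 2))       # global average pooling, (C,)
fused = weighted_fusion(glob, local, np.zeros((2, 8)))
```

With all-zero scores the two branches receive equal weight, so `fused` is simply the average of the global and local vectors; learned scores would shift that balance per channel.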

Highlights

  • Facial expression is a fundamental way of conveying human emotions and plays a significant part in our daily communication

  • Motivated by the process and the prior knowledge, we propose a Weakly Supervised Local-Global Attention Network (WS-LGAN) to learn global representations and, at the same time, learn local features around eyes and mouth to facilitate local-enhanced facial expression recognition

  • Our facial expression recognition solution achieves state-of-the-art results on CK+, Oulu-CASIA and RAF-DB with accuracies of 98.06%, 88.26% and 85.07%, respectively

Summary

Introduction

Facial expression is a fundamental way of conveying human emotions and plays a significant part in our daily communication. Facial expression recognition is a complex but interesting problem, with extensive applications in fatigue surveillance [1], human-machine interaction [2], patient care [3], neuromarketing [4], and interactive games [5], among others. It has received substantial attention from researchers in computer vision, affective computing, and human-computer interaction. Although great success has been achieved in recent years [6]–[8], accurate facial expression recognition remains challenging, mainly because of the complexity and variability of facial expressions. We summarize the obstacles as follows:

The associate editor coordinating the review of this manuscript and approving it for publication was Mohammad Ayoub Khan.
