Abstract

Many facial expression recognition (FER) methods now achieve satisfactory results. However, some facial expressions involve similar muscle deformations and are therefore easily confused; distinguishing these confusable expressions remains a key challenge for accurate recognition. In addition, most current FER methods rely on convolution, a building block that processes one local neighborhood at a time and therefore fails to capture the geometric patterns that are important for facial muscle deformation. To address these issues, and to cope with pose variations and insufficient training data, this paper proposes a geometry-aware conditional network (GACN) that captures long-range dependencies for simultaneous pose-invariant facial expression editing and geometry-aware FER. Specifically, the GACN performs pose-invariant image editing with long-range dependencies by introducing conditional self-attention operations into a generative adversarial network. Moreover, the GACN uses non-local operations as building blocks of the classifier to capture texture and geometry patterns simultaneously. Through the GACN, these two tasks further boost each other's performance, confusable facial expressions are effectively distinguished, and the effect of pose variations is overcome while the training set is expanded and enriched. The proposed algorithm is evaluated on both in-the-lab and in-the-wild datasets and outperforms state-of-the-art methods.
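
To illustrate the contrast the abstract draws between convolution and non-local operations, the following is a minimal NumPy sketch of a generic non-local (self-attention) operation over a flattened feature map. It is not the GACN implementation: the function name, the projection matrices `w_theta`, `w_phi`, and `w_g`, and all shapes are assumptions chosen for illustration. The point it shows is that every output position is a weighted sum over all positions, rather than over a local neighborhood.

```python
import numpy as np


def non_local_operation(x, w_theta, w_phi, w_g):
    """Generic non-local operation (illustrative, not the paper's code).

    x: (N, C) array holding one C-dimensional feature per spatial position.
    w_theta, w_phi: (C, D) hypothetical projections for pairwise similarity.
    w_g: (C, C) hypothetical projection for the aggregated features.
    """
    theta = x @ w_theta  # (N, D) "query" embedding of each position
    phi = x @ w_phi      # (N, D) "key" embedding of each position
    g = x @ w_g          # (N, C) "value" embedding of each position

    # Pairwise similarity between every pair of positions: (N, N).
    scores = theta @ phi.T
    # Numerically stable softmax over all positions for each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each output aggregates features from all positions, plus a residual.
    return x + weights @ g


# Toy usage: a 4x4 feature map with 8 channels, flattened to 16 positions.
rng = np.random.default_rng(0)
h, w, c, d = 4, 4, 8, 4
features = rng.standard_normal((h * w, c))
w_theta = rng.standard_normal((c, d)) * 0.1
w_phi = rng.standard_normal((c, d)) * 0.1
w_g = rng.standard_normal((c, c)) * 0.1

out = non_local_operation(features, w_theta, w_phi, w_g)
```

Because the attention weights span all positions, a feature near one facial region can directly influence another distant region, which is the kind of long-range dependency the abstract refers to. The residual connection means the operation reduces to the identity when `w_g` is zero.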
