Abstract

In interpersonal communication, facial expressions are an important way to convey one's emotions. To enable computers to understand facial expressions as humans do, many researchers have devoted considerable time and effort to the problem. However, most existing work on dynamic-sequence facial expression recognition fails to fully exploit the combined advantages of shallow features (prior knowledge) and deep features (high-level semantics). Therefore, this paper implements a dynamic-sequence facial expression recognition system that integrates shallow and deep features with an attention mechanism. To extract shallow features, an Attention Shallow Model (ASModel) is proposed, which uses the relative positions of facial landmarks and the texture characteristics of local facial regions to describe the Action Units of the Facial Action Coding System. Exploiting the strength of deep convolutional neural networks in expressing high-level features, an Attention Deep Model (ADModel) is also designed to extract deep features from facial image sequences. Finally, the ASModel and ADModel are integrated into a Multi-attention Shallow and Deep Model (MSDModel) to perform dynamic-sequence facial expression recognition. Three kinds of attention mechanisms are introduced: Self-Attention (SA), Weight-Attention (WA), and Convolution-Attention (CA). We verify our dynamic expression recognition system on three publicly available databases, CK+, MMI, and Oulu-CASIA, and obtain superior performance compared with other state-of-the-art results.
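
As a rough illustration of the kind of shallow/deep fusion the abstract describes, the following PyTorch-style sketch combines a landmark-based branch and a per-frame CNN branch with a learned attention weighting over the two feature streams. All module names, dimensions, and the fusion rule here are assumptions made for illustration; they are not the paper's ASModel, ADModel, or MSDModel implementation.

# Illustrative sketch only: a hypothetical fusion of a shallow-feature branch and a
# deep-feature branch using a simple attention-weighted combination. Module names,
# dimensions, and the fusion rule are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class ShallowBranch(nn.Module):
    """Stand-in for an ASModel-like branch: encodes per-frame landmark/texture
    descriptors (here a flat vector per frame) into a fixed-size feature."""
    def __init__(self, in_dim=136, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())

    def forward(self, x):               # x: (batch, frames, in_dim)
        return self.mlp(x).mean(dim=1)  # average over frames -> (batch, out_dim)

class DeepBranch(nn.Module):
    """Stand-in for an ADModel-like branch: a small CNN applied to each frame."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, out_dim))

    def forward(self, x):               # x: (batch, frames, 3, H, W)
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1))   # (batch*frames, out_dim)
        return f.view(b, t, -1).mean(dim=1)

class FusionModel(nn.Module):
    """Hypothetical MSDModel-like fusion: a learned attention weight per branch."""
    def __init__(self, dim=128, num_classes=7):
        super().__init__()
        self.shallow = ShallowBranch(out_dim=dim)
        self.deep = DeepBranch(out_dim=dim)
        self.gate = nn.Linear(2 * dim, 2)        # attention over the two branches
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, landmarks, frames):
        s, d = self.shallow(landmarks), self.deep(frames)
        w = torch.softmax(self.gate(torch.cat([s, d], dim=1)), dim=1)
        fused = w[:, :1] * s + w[:, 1:] * d      # weighted sum of branch features
        return self.classifier(fused)

# Usage with random tensors (assumed shapes: 8-frame clips, 68 landmarks x 2 coords):
model = FusionModel()
logits = model(torch.randn(2, 8, 136), torch.randn(2, 8, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 7])

The per-branch softmax gate is just one plausible way to realize attention-based fusion; the paper's SA, WA, and CA mechanisms operate on the features themselves and are not reproduced here.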
