Learning Local Responses of Facial Landmarks with Conditional Variational Auto-Encoder for Face Alignment

Shuying Liu, Jiani Hu, Yipeng Huang, Weihong Deng

doi:10.1109/fg.2017.117

Abstract

This work proposes a novel convolutional neural network architecture that locates facial landmarks accurately by learning their local responses. The network consists of a Conditional Variational Auto-Encoder (CVAE) and a Deep Convolutional Neural Network (DCNN). The CVAE learns response maps of facial landmarks from face images, and the DCNN learns accurate landmark locations from the response maps and facial textures. The CVAE consists of a face encoder, which extracts high-level information from raw pixels, and a decoder, which outputs local response maps from the high-level encoding. We formulate the CVAE used for capturing local responses as an optimization problem that can be solved through back-propagation. Extensive experiments show that the proposed CVAE learns better local response maps than a Fully Convolutional Network (FCN). Our method outperforms state-of-the-art methods on AFLW (5 points) and on the challenging subset of 300-W (68 points), which indicates an advantage under complex poses and expressions.
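The abstract does not give the exact objective, so the following is only a minimal sketch of the kind of loss a variational auto-encoder typically minimizes through back-propagation: a reconstruction term on the response map plus a KL term on the latent code. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term; the function names and the `kl_weight` parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def cvae_loss(target_map, predicted_map, mu, logvar, kl_weight=1.0):
    """Squared reconstruction error on a flattened response map plus weighted KL.

    Assumption: this is a generic VAE-style loss, not the paper's derivation.
    """
    recon = sum((t - p) ** 2 for t, p in zip(target_map, predicted_map))
    return recon + kl_weight * gaussian_kl(mu, logvar)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, logvar = [0.5, -0.2], [0.0, -1.0]   # hypothetical encoder outputs
    z = reparameterize(mu, logvar, rng)      # latent code passed to a decoder
    target = [0.0, 1.0, 0.0, 0.0]            # flattened ground-truth response map
    predicted = [0.1, 0.8, 0.1, 0.0]         # hypothetical decoder output
    print(len(z), cvae_loss(target, predicted, mu, logvar))
```

In a real model the encoder and decoder would be convolutional networks and the loss would be minimized with an automatic-differentiation framework; plain Python lists are used here only to keep the sketch self-contained.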