Recursive Spatial Transformer (ReST) for Alignment-Free Face Recognition

Wanglong Wu,Meina Kan,Yi Yang,Xilin Chen,Shiguang Shan,Xin Liu

doi:10.1109/iccv.2017.407

Abstract

Convolutional Neural Network (CNN) has led to significant progress in face recognition. Currently most CNN-based face recognition methods follow a two-step pipeline, i.e. a detected face is first aligned to a canonical one predefined by a mean face shape, and then it is fed into a CNN to extract features for recognition. The alignment step transforms all faces to the same shape, which can cause loss of geometrical information which is helpful in distinguishing different subjects. Moreover, it is hard to define a single optimal shape for the following recognition, since faces have large diversity in facial features, e.g. poses, illumination, etc. To be free from the above problems with an independent alignment step, we introduce a Recursive Spatial Transformer (ReST) module into CNN, allowing face alignment to be jointly learned with face recognition in an end-to-end fashion. The designed ReST has an intrinsic recursive structure and is capable of progressively aligning faces to a canonical one, even those with large variations. To model non-rigid transformation, multiple ReST modules are organized in a hierarchical structure to account for different parts of faces. Overall, the proposed ReST can handle large face variations and non-rigid transformation, and is end-to-end learnable and adaptive to input, making it an effective alignment-free face recognition solution. Extensive experiments are performed on LFW and YTF datasets, and the proposed ReST outperforms those two-step methods, demonstrating its effectiveness.

Full Text