Stacked Capsule Graph Autoencoders for geometry-aware 3D head pose estimation

Chaoqun Hong, Liang Chen, Yuxin Liang, Zhiqiang Zeng

doi:10.1016/j.cviu.2021.103224

Abstract

Image-based 3D head pose estimation aims to estimate facial direction from 2D images. Head pose is an important attribute that is widely used in face-related applications. However, accurate estimation is difficult because of complicated facial parts and the absence of pose information in facial images. Recently, neural-network-based methods have brought some improvement, but most of them ignore the geometric information of facial parts. In this paper, we tackle this issue and propose a novel geometry-aware representation based on Stacked Capsule Graph Autoencoders (SCGAE). Unlike current methods, we apply Stacked Capsule Autoencoders (SCAE) to encode the parts and poses of facial images. These parts and poses are used to train templates and to reconstruct the original faces in the decoders. In addition, we improve SCAE with a locality loss that exploits the inner relationships among similar samples; it is implemented through graph regularization. In this way, an improved geometry-aware representation can be computed. The representation is compatible with existing regression methods, and experimental results on commonly used head pose estimation datasets validate the effectiveness of SCGAE.
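
The abstract does not give the exact form of the locality loss, so the following is only a minimal sketch of a generic graph regularization term, assuming a symmetric k-nearest-neighbour affinity matrix W and the common penalty 0.5 * sum_ij W_ij * ||z_i - z_j||^2 on the learned codes z. The helper names `knn_affinity` and `locality_loss`, the 0/1 weighting, and the choice of k are illustrative assumptions, not the authors' implementation.

```python
import math


def knn_affinity(features, k=2):
    """Build a symmetric 0/1 k-nearest-neighbour affinity matrix.

    Hypothetical helper: the paper may weight or construct edges differently.
    """
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Euclidean distance from sample i to every other sample, nearest first.
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            # Symmetrize: connect i and j if either is a neighbour of the other.
            W[i][j] = 1.0
            W[j][i] = 1.0
    return W


def locality_loss(codes, W):
    """Graph regularizer: 0.5 * sum_ij W_ij * ||z_i - z_j||^2.

    Small when samples linked in the graph receive similar codes.
    """
    n = len(codes)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if W[i][j]:
                sq_dist = sum((a - b) ** 2 for a, b in zip(codes[i], codes[j]))
                total += W[i][j] * sq_dist
    return 0.5 * total


# Toy example: two well-separated pairs of samples, one neighbour each.
features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
W = knn_affinity(features, k=1)
loss_same_codes = locality_loss([[1.0, 1.0]] * 4, W)  # identical codes
loss_raw_codes = locality_loss(features, W)           # codes equal to inputs
```

In a training setup of this kind, such a term would typically be added to the autoencoder's reconstruction objective with a weighting coefficient, so that neighbouring samples are pulled toward nearby representations.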

Full Text