Existing approaches usually perform facial landmark detection and head pose estimation independently and sequentially, ignoring the coupling between the two tasks. We introduce a unified framework, coupled cascade regression (CCR), that performs facial landmark detection and head pose estimation simultaneously. Building on the cascade regression framework, we learn two separate regressors that update the landmark locations and the three-dimensional (3D) face model parameters at each cascade level. To capture the coupling between landmark locations and head pose, we then apply the 3D face projection model in each cascade iteration to refine the predictions and make them consistent. CCR thus leverages both the learned regressors and the projection model, improving performance on both tasks. To address the limited head pose variation in the training data, we also learn the cascade regressors from a combination of real and synthesized face images. Experimental results on the Helen, Labeled Face Parts in the Wild (LFPW), 300-W, and Boston University datasets show that CCR outperforms conventional methods on both landmark detection and head pose estimation.
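As a rough illustration of the coupling idea, the following minimal sketch shows what a single cascade level might look like. It is not the method from the paper: the weak-perspective camera, the Euler-angle convention, the six-element pose vector, the `blend` weight, and the names `project`, `coupled_step`, `landmark_regressor`, and `pose_regressor` are all assumptions made for illustration, and the regressors are stand-in callables rather than learned models.

```python
import numpy as np


def rotation_matrix(pitch, yaw, roll):
    """Rotation from Euler angles in radians, composed as Rz(roll) @ Ry(yaw) @ Rx(pitch).

    The angle convention is an assumption for this sketch.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx


def project(shape3d, pose):
    """Weak-perspective projection of an (N, 3) face shape to (N, 2) image points.

    pose = (scale, pitch, yaw, roll, tx, ty); this parameterization is hypothetical.
    """
    scale, pitch, yaw, roll, tx, ty = pose
    rotated = shape3d @ rotation_matrix(pitch, yaw, roll).T
    return scale * rotated[:, :2] + np.array([tx, ty])


def coupled_step(image, landmarks, pose, shape3d,
                 landmark_regressor, pose_regressor, blend=0.5):
    """One cascade level: two separate updates, then a projection-based consistency step."""
    # Two separate regressors update the 2D landmarks and the 3D model parameters.
    landmarks = landmarks + landmark_regressor(image, landmarks)
    pose = pose + pose_regressor(image, landmarks)

    # Coupling (assumed form): pull the landmarks toward the projection of the
    # 3D model under the current pose so that the two estimates agree.
    projected = project(shape3d, pose)
    landmarks = (1.0 - blend) * landmarks + blend * projected
    return landmarks, pose


# Usage with placeholder regressors that predict no change.
rng = np.random.default_rng(0)
shape3d = rng.normal(size=(5, 3))                 # toy 3D face shape, 5 points
pose = np.array([2.0, 0.0, 0.0, 0.0, 1.0, -1.0])  # scale, pitch, yaw, roll, tx, ty
landmarks = project(shape3d, pose)                # landmarks consistent with the pose

new_landmarks, new_pose = coupled_step(
    image=None,
    landmarks=landmarks,
    pose=pose,
    shape3d=shape3d,
    landmark_regressor=lambda img, lm: np.zeros_like(lm),
    pose_regressor=lambda img, lm: np.zeros(6),
)
```

In a full cascade this step would be repeated over several levels, each with its own pair of regressors trained on features extracted around the current landmark estimates. The simple linear blend used here only stands in for whatever refinement the projection model provides in the actual method.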