Abstract

Convolutional neural networks (CNNs) perform well on head pose estimation under controlled conditions, but their generalization in the wild needs improvement. To address this issue, we introduce facial landmark information through a task simplifier and a landmark heatmap generator placed before the feed-forward neural network. These modules use the landmarks to normalize the face shape into a canonical shape and to generate a landmark heatmap from the transformed landmarks, which assists feature extraction and improves generalization in the wild. Our method was trained on 300W-LP and tested on AFLW2000-3D. The results show that, for the same feed-forward neural network, introducing facial landmark information into the CNN with our method improves accuracy from 88.5% to 99.0% and reduces the mean average error from 5.94° to 1.46° on AFLW2000-3D. Furthermore, we evaluate our method on several datasets used for pose estimation and compare the results with those on AFLW2000-3D. We find that the features extracted by a CNN do not reflect head pose efficiently, which limits the performance of the CNN on head pose estimation in the wild. With facial landmarks introduced, the CNN extracts features that reflect head pose more efficiently, which significantly improves the accuracy of head pose estimation in the wild.
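
The two landmark-based steps named in the abstract (normalizing the face shape to a canonical shape, then rendering a heatmap from the transformed landmarks) can be illustrated with a minimal NumPy sketch. The abstract does not specify the transform or the heatmap form, so everything here is an assumption made for illustration: a least-squares similarity transform (scale, rotation, translation) stands in for the task simplifier, one Gaussian per landmark stands in for the heatmap generator, and the function names, `size`, and `sigma` are hypothetical rather than taken from the paper.

```python
import numpy as np


def align_to_canonical(landmarks, canonical):
    """Map 2D landmarks onto a canonical shape with a least-squares
    similarity transform (assumed here; not the paper's stated method).

    Both inputs are arrays of shape (N, 2) with matching point order.
    """
    mu_l = landmarks.mean(axis=0)
    mu_c = canonical.mean(axis=0)
    l0 = landmarks - mu_l
    c0 = canonical - mu_c

    # Cross-covariance between target (canonical) and source (landmarks).
    cov = c0.T @ l0 / len(landmarks)
    u, s, vt = np.linalg.svd(cov)

    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.ones(2)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        d[-1] = -1.0
    rot = u @ np.diag(d) @ vt

    # Optimal isotropic scale for the chosen rotation.
    var_l = (l0 ** 2).sum() / len(landmarks)
    scale = (s * d).sum() / var_l

    return scale * (l0 @ rot.T) + mu_c


def landmark_heatmap(landmarks, size=64, sigma=1.5):
    """Render one Gaussian per landmark into a single (size, size) map.

    Landmarks are (x, y) pixel coordinates; overlapping Gaussians are
    merged with a pixel-wise maximum so the peak value stays at 1.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    heat = np.zeros((size, size), dtype=np.float64)
    for x, y in landmarks:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g)
    return heat.astype(np.float32)
```

In such a pipeline the resulting heatmap would typically be stacked with the image as an extra input channel to the CNN, though the abstract does not say how the authors combine them.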
