Abstract

Recent head pose estimation techniques have advanced by performing bin classification, in which the predicted result is compared against a one-hot classification vector. We argue that head poses are better modelled by a discrete distribution sampled from a smooth continuous curve than by one-hot coding or other kinds of binned classification vectors, since pose angles in practice are arbitrary. In this paper, we propose a deep head pose estimation scheme that regresses predicted probabilistic labels onto a discrete Gaussian distribution. This Gaussian distribution models the arbitrary state of true head poses and supervises the deep network through a maximum mean discrepancy loss. We also propose a spatial channel-aware residual attention structure that enhances intrinsic pose features, further improving prediction accuracy and speeding up training convergence. Experiments on two public datasets, AFLW2000 and BIWI, show that the proposed method outperforms all previous methods and that its individual components yield substantial improvements.
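The core idea of supervising with a discrete Gaussian label through a maximum mean discrepancy (MMD) loss can be illustrated with a minimal NumPy sketch. The abstract does not give the bin layout, the Gaussian width, or the kernel, so the 3-degree bins over [-99, 99], `sigma`, the RBF kernel, and `bandwidth` below are all assumptions chosen for illustration, and the function names are hypothetical rather than the authors' implementation.

```python
import numpy as np

def gaussian_label(angle_deg, bin_centers, sigma=3.0):
    """Discrete distribution obtained by sampling a Gaussian centred on the
    true angle at each bin centre, then normalising to sum to 1.
    sigma is an assumed width, not a value from the paper."""
    weights = np.exp(-0.5 * ((bin_centers - angle_deg) / sigma) ** 2)
    return weights / weights.sum()

def mmd_squared(p, q, bin_centers, bandwidth=10.0):
    """Squared MMD between two discrete distributions over the same bins,
    using an RBF kernel on the bin centres (kernel choice is an assumption)."""
    diff = bin_centers[:, None] - bin_centers[None, :]
    kernel = np.exp(-0.5 * (diff / bandwidth) ** 2)
    d = p - q
    return float(d @ kernel @ d)

# Assumed bin layout: 67 bins, 3 degrees apart, covering -99 to 99 degrees.
bin_centers = np.arange(-99.0, 100.0, 3.0)

target = gaussian_label(10.0, bin_centers)   # soft label for a 10 degree yaw
near = gaussian_label(13.0, bin_centers)     # stand-in prediction, 3 degrees off
far = gaussian_label(40.0, bin_centers)      # stand-in prediction, 30 degrees off

loss_near = mmd_squared(target, near, bin_centers)
loss_far = mmd_squared(target, far, bin_centers)
```

Under these assumptions the loss grows with the angular distance between prediction and target, which a one-hot target compared bin by bin would not capture: a prediction one bin away and one ten bins away would both simply be wrong.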
