Abstract
This work addresses automatic head pose estimation and its application to 3D gaze estimation using low-quality RGB-D sensors, without any subject cooperation or manual intervention. Previous work on 3D head pose estimation with RGB-D sensors requires an offline step, either for supervised learning or for 3D head model construction, and reconstructing a complete head model may need manual intervention or subject cooperation. In this paper, we propose a 3D pose estimator based on low-quality depth data that is not limited by any of these steps. Instead, the proposed technique models the subject’s face in 3D rather than the complete head, which in turn relaxes all of the constraints of previous work. The proposed method is robust, highly accurate, fully automatic, and needs no offline step. Unlike some previous work, it uses only depth data for pose estimation. Experimental results on the Biwi head pose database confirm the efficiency of our algorithm in handling large pose variations and partial occlusion. We also evaluated the algorithm on the IDIAP database for 3D head pose and eye gaze estimation.
Highlights
Head pose estimation is a key step in understanding human behavior and can have different interpretations depending on the context
Previous work on head pose estimation can be divided into two categories: (i) methods based on 2D images; and (ii) methods based on depth data [1]
This work addressed the problem of automatic facial pose and gaze estimation, without subject cooperation or manual intervention, using low-quality depth data provided by the Microsoft Kinect
Summary
Head pose estimation is a key step in understanding human behavior and can have different interpretations depending on the context. From the computer vision point of view, head pose estimation is the task of inferring the orientation of the head, relative to the coordinate system of the imaging sensor, from digital images or range data. Previous work on head pose estimation can be divided into two categories: (i) methods based on 2D images; and (ii) methods based on depth data [1]. Pose estimators based on 2D images generally require pre-processing steps that translate the pixel-based representation of the head into direction cues. They also face several challenges, such as camera distortion, projective geometry, lighting, and changes in facial expression. A comprehensive study of pose estimation is given in [1], and the reader can refer to it for more details on the literature.
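To make the notion of head orientation relative to the sensor coordinate system concrete, the following is a minimal sketch, not the method proposed in this work. It assumes a camera frame with x to the right, y downward, and z pointing away from the sensor, a frontal face that looks back along the negative z axis, and the rotation order R = Rz(roll) Ry(yaw) Rx(pitch). These conventions and the function names `rotation_matrix` and `facing_direction` are illustrative assumptions and do not come from the paper.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(yaw, pitch, roll):
    """Build R = Rz(roll) * Ry(yaw) * Rx(pitch); angles in radians.

    The axis assignment and multiplication order are assumptions made
    for this illustration; other conventions are equally common.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]   # pitch about the x axis
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]   # yaw about the y axis
    rz = [[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]]   # roll about the z axis
    return matmul(rz, matmul(ry, rx))


def facing_direction(yaw, pitch, roll):
    """Return the unit vector the face points along, in sensor coordinates."""
    r = rotation_matrix(yaw, pitch, roll)
    frontal = (0.0, 0.0, -1.0)  # assumed: a frontal face looks at the sensor
    return tuple(sum(r[i][k] * frontal[k] for k in range(3)) for i in range(3))


if __name__ == "__main__":
    # A head turned 30 degrees in yaw, with no pitch or roll.
    print(facing_direction(math.radians(30.0), 0.0, 0.0))
```

Under these assumptions a pose estimator outputs the three angles (or the matrix R directly), and the facing direction follows by rotating the frontal vector. A 3D gaze estimator would additionally need the eye orientation relative to the head, which this sketch does not model.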