Abstract

Head pose estimation is a fundamental task in many human–computer interaction applications. Most of these applications demand accurate six-degrees-of-freedom head pose estimation (6DoF-HPE) under full-range angles and take sequential images of the human head as input. However, most existing head pose estimation methods focus on three-degrees-of-freedom (3DoF) estimation of frontal heads, which restricts their use in real-world scenarios. This study presents a framework that estimates head pose without landmark localization. The novelty of our framework lies in estimating 6DoF head poses under full-range angles in real time. The proposed framework leverages deep neural networks, detecting human heads with a Single Shot MultiBox Detector (SSD) and predicting their angles with a RepVGG-B1g4 backbone. It uses red, green, blue, and depth (RGB-D) data to estimate the rotational and translational components relative to the camera pose. The framework employs a continuous rotation representation to predict the angles and a multi-loss training strategy; the regression loss combines the geodesic loss with the mean squared error (MSE). Ground-truth labels for full-range head angles were extracted from the public Carnegie Mellon University (CMU) Panoptic dataset. This study provides a comprehensive comparison with state-of-the-art methods on public benchmark datasets. Experiments demonstrate that the proposed method matches or outperforms state-of-the-art methods. The code and datasets are available at https://github.com/Redhwan-A/6DoFHPE.
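
For illustration only (this is not the authors' released code), the sketch below shows how the two components named in the abstract might look in PyTorch: a continuous 6D rotation representation decoded into a valid rotation matrix via Gram–Schmidt orthogonalization, and a multi-loss objective combining a geodesic rotation loss with an MSE translation loss. The weighting term `alpha` is a hypothetical parameter, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def rotation_6d_to_matrix(x6: torch.Tensor) -> torch.Tensor:
    """Decode a continuous 6D rotation representation into a 3x3 rotation matrix."""
    a1, a2 = x6[..., :3], x6[..., 3:]
    b1 = F.normalize(a1, dim=-1)
    # Gram-Schmidt: remove the b1 component from a2, then normalize
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack((b1, b2, b3), dim=-2)

def geodesic_loss(R_pred: torch.Tensor, R_gt: torch.Tensor) -> torch.Tensor:
    """Mean geodesic angle (radians) between batches of 3x3 rotation matrices."""
    R_rel = torch.bmm(R_pred.transpose(1, 2), R_gt)
    # trace(R_rel) = 1 + 2*cos(theta); clamp for numerical safety
    trace = R_rel.diagonal(dim1=1, dim2=2).sum(-1)
    cos_theta = torch.clamp((trace - 1.0) / 2.0, -1.0 + 1e-7, 1.0 - 1e-7)
    return torch.acos(cos_theta).mean()

def pose_loss(R_pred, R_gt, t_pred, t_gt, alpha: float = 1.0):
    """Multi-loss: geodesic rotation loss plus MSE translation loss.

    `alpha` is a hypothetical weighting term, not specified in the abstract.
    """
    return geodesic_loss(R_pred, R_gt) + alpha * F.mse_loss(t_pred, t_gt)

if __name__ == "__main__":
    x6 = torch.randn(4, 6)               # raw 6D network outputs
    R_pred = rotation_6d_to_matrix(x6)   # (4, 3, 3) valid rotations
    R_gt = torch.eye(3).expand(4, 3, 3)  # dummy ground-truth rotations
    t_pred, t_gt = torch.randn(4, 3), torch.randn(4, 3)
    print(pose_loss(R_pred, R_gt, t_pred, t_gt).item())
```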
