Abstract

Head pose estimation (HPE) is widely used in attention detection, behavior analysis, and expression recognition. Nevertheless, in complex scenes (such as facial occlusion, large head deflection angles, and multiple people in one scene), HPE still suffers from low estimation accuracy. To address this problem, we propose a dual-position feature fusion method for head pose estimation. First, the RGB input is replaced with a standard luminance representation, which reduces the effect of extraneous lighting factors. Next, a center offset loss is used to detect the head and body positions, and a dynamic adjustment strategy shrinks the bounding boxes, which not only yields the best confidence but also improves multi-person HPE. Finally, the estimation results from the head position and the body position are fused to further reduce the estimation loss. We evaluated our approach on the popular public AFLW2000, BIWI, and UPNA datasets; the results show its superiority in handling occlusion, deflection, and multi-person scenes.
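The first step, replacing the RGB input with a luminance channel, can be sketched as follows. The abstract does not name the luminance standard used, so this sketch assumes the common ITU-R BT.601 weighting; the function name `rgb_to_luminance` is illustrative, not from the paper.

```python
import numpy as np

def rgb_to_luminance(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image to a single-channel luminance map.

    Assumption: ITU-R BT.601 weights (the paper only says "standard
    luminance" without specifying which standard it uses).
    """
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 weights for R, G, B
    return rgb @ weights

# Example: a pure-white image maps to luminance 255 everywhere,
# since the three weights sum to 1.0.
white = np.full((2, 2, 3), 255.0)
luma = rgb_to_luminance(white)
```

Dropping chrominance this way keeps scene structure while discarding color casts, which is consistent with the stated goal of reducing the effect of extraneous lighting factors.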
