Abstract Accurate estimation of human head pose remains a significant challenge across many application domains. To address the limitations of previous approaches, this research proposes an unconstrained head pose estimation strategy. The method combines deep learning with a rotation-matrix representation: the neural network outputs a nine-dimensional vector, which is projected onto a rotation matrix in SO(3) through singular value decomposition. This ensures that the rotation representation is both smooth and unique. The approach offers distinct advantages for rotation estimation, particularly when the rotation representation is used directly as the model output. It avoids the discontinuity and double-cover issues associated with prior methods, and it improves the stability of the representation in high-dimensional space, which in turn eases learning. The network is trained with a geodesic loss function. Experiments on the AFLW2000 and BIWI datasets show that the proposed strategy outperforms previous state-of-the-art methods.
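The two components named in the abstract, the SVD projection of a nine-dimensional network output onto SO(3) and the geodesic loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `project_to_so3` and `geodesic_distance` are hypothetical, the determinant correction is the standard orthogonal Procrustes construction and is assumed here, and a real training setup would use a differentiable framework rather than NumPy.

```python
import numpy as np

def project_to_so3(v9):
    """Map a raw 9-D vector to the closest rotation matrix (Frobenius norm).

    Hypothetical helper; assumes row-major reshaping of the 9-D output.
    """
    m = np.asarray(v9, dtype=float).reshape(3, 3)
    u, _, vt = np.linalg.svd(m)
    # Flip the last singular direction if needed so that det(R) = +1,
    # which keeps the result in SO(3) rather than only in O(3).
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    return u @ np.diag([1.0, 1.0, d]) @ vt

def geodesic_distance(r_pred, r_true):
    """Rotation angle (radians) between two rotation matrices."""
    cos_angle = (np.trace(r_pred.T @ r_true) - 1.0) / 2.0
    # Clip to guard against round-off pushing the value outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Usage: an arbitrary 9-D "network output" becomes a valid rotation.
raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0]
r = project_to_so3(raw)
loss = geodesic_distance(r, np.eye(3))
```

Because every nine-dimensional vector maps to some rotation matrix, the network output needs no explicit orthogonality constraint; the geodesic distance then measures the error as the angle of the relative rotation between prediction and ground truth.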