Abstract

• Propose a collaborative learning framework for head pose estimation.
• Learn complementary information from the landmark-based and landmark-free branches.
• Design a Landmark-MLP-Mixer to obtain the head pose angles from facial landmarks.
• Introduce a union loss function to optimize two complementary branches.
• Conduct comparison experiments and ablation studies on various datasets.

Head pose estimation is an important task in many real-world applications, such as human–computer interaction, driver monitoring, face localization and gaze estimation. In this paper, we present a novel collaborative learning framework based on Convolutional Neural Networks (CNNs) for head pose estimation. The proposed framework consists of a landmark-based branch and a landmark-free branch. The former first estimates facial landmarks and then passes them to the Landmark-MLP-Mixer module, which models the complex nonlinear mapping from facial landmarks to head pose angles, while the latter adopts a label distribution learning strategy to estimate head pose. Both branches are dedicated to the head pose estimation task, and they collaborate with each other for mutual promotion and complementary semantic learning. Specifically, we introduce a dual-branch transfer module in the middle of the network to achieve explicit semantic interaction, and a multi-loss strategy that induces implicit information interaction. We conduct extensive experiments on several popular benchmarks, including AFLW, AFLW2000 and BIWI. The results show that our method is competitive with other state-of-the-art methods.
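
As a rough illustration of the label distribution learning idea used by the landmark-free branch, and of how a loss over two branches might be combined, the sketch below turns a ground-truth angle into a soft distribution over angle bins, decodes a predicted angle as the expectation over those bins, and sums one loss term per branch. The abstract does not specify any of these details, so everything here is an assumption made for illustration: the bin range and width, the Gaussian width `sigma`, the use of KL divergence for the landmark-free branch and an absolute error for the landmark-based branch, the weights `alpha` and `beta`, and all function names.

```python
import math

def bin_centers(lo=-99.0, hi=99.0, width=3.0):
    # Assumed discretization: equal-width angle bins covering [lo, hi] degrees.
    n = int(round((hi - lo) / width))
    return [lo + width * (i + 0.5) for i in range(n)]

def gaussian_label_distribution(angle, centers, sigma=3.0):
    # Soft label: a normalized Gaussian over bin centers, peaked at the true angle.
    weights = [math.exp(-0.5 * ((c - angle) / sigma) ** 2) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_angle(probs, centers):
    # Decode a continuous angle as the expectation over bin centers.
    return sum(p * c for p, c in zip(probs, centers))

def kl_divergence(target, pred, eps=1e-12):
    # KL(target || pred); eps guards against log(0).
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, pred))

def union_loss(target_dist, logits, landmark_angle, true_angle,
               alpha=1.0, beta=1.0):
    # Hypothetical combined objective: a distribution loss for the
    # landmark-free branch plus an absolute-error loss for the angle
    # regressed by the landmark-based branch.
    free_loss = kl_divergence(target_dist, softmax(logits))
    landmark_loss = abs(landmark_angle - true_angle)
    return alpha * free_loss + beta * landmark_loss

if __name__ == "__main__":
    centers = bin_centers()
    target = gaussian_label_distribution(10.0, centers)
    print(expected_angle(target, centers))
    print(union_loss(target, [0.0] * len(centers), 12.0, 10.0))
```

In a real network the logits would come from the landmark-free branch and `landmark_angle` from the landmark-based branch, with one such term per pose angle (yaw, pitch and roll); the weights would be tuned on validation data.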
