Abstract
Accurate and robust camera pose estimation is essential for high-level applications such as augmented reality and autonomous driving. Despite the development of global feature-based camera pose regression methods and local feature-based matching guided pose estimation methods, challenging conditions, such as illumination changes and viewpoint changes, as well as inaccurate keypoint localization, continue to affect the performance of camera pose estimation. In this paper, we propose a novel relative camera pose regression framework that uses global features with rotation consistency and local features with rotation invariance. First, we apply a multi-level deformable network to detect and describe local features, which can learn appearances and gradient information sensitive to rotation variants. Second, we process the detection and description processes using the results from pixel correspondences of the input image pairs. Finally, we propose a novel loss that combines relative regression loss and absolute regression loss, incorporating global features with geometric constraints to optimize the pose estimation model. Our extensive experiments report satisfactory accuracy on the 7Scenes dataset with an average mean translation error of 0.18 m and a rotation error of 7.44° using image pairs as input. Ablation studies were also conducted to verify the effectiveness of the proposed method in the tasks of pose estimation and image matching using the 7Scenes and HPatches datasets.
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have