Abstract
Camera localization is an essential technique in many applications, such as robot navigation, mixed reality, and unmanned vehicles. We address the problem of predicting the six-degree-of-freedom (6-DoF) pose of a camera from a single color image in a given three-dimensional (3D) environment. In this paper, we propose a robust learning model for this task. Our method consists of two steps: image retrieval and space reprojection. The former handles localization against pre-captured reference images, which rely on the correspondence between pixel points and scene coordinates; the latter carries out camera calibration between the two-dimensional (2D) image plane and the 3D scene. Given a 2D image, an initial localization is accomplished rapidly by matching a reference image using Siamese networks. More precise localization is then achieved by calibrating the camera between the 2D image and the 3D scene using a fully convolutional network. Experimental results on a public dataset show that our model is more robust and extensible than previous methods. Finally, we apply the system to Unmanned Aerial Vehicle (UAV) localization and achieve good results.
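To make the second stage concrete, below is a minimal sketch, not the authors' implementation, of how a 6-DoF pose could be recovered once a network has predicted a 3D scene coordinate for each 2D pixel: a standard perspective-n-point (PnP) solver with RANSAC over the 2D-3D correspondences. The function name estimate_pose, the intrinsic matrix K, and the RANSAC parameters are illustrative assumptions, not values from the paper.

```python
# Sketch: recover a 6-DoF camera pose from 2D-3D correspondences
# (pixel coordinates paired with predicted scene coordinates)
# using PnP + RANSAC. Assumes a pinhole camera with known intrinsics K.
import numpy as np
import cv2

def estimate_pose(points_2d, points_3d, K):
    """points_2d: (N, 2) pixel coords; points_3d: (N, 3) scene coords."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float32),
        points_2d.astype(np.float32),
        K,
        distCoeffs=None,
        reprojectionError=8.0,   # illustrative inlier threshold (pixels)
        iterationsCount=100)     # illustrative RANSAC budget
    if not ok:
        raise RuntimeError("PnP failed to find a pose")
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return R, tvec               # pose: x_cam = R @ x_world + tvec
```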