Abstract
Visual relocalization is a key technology in many computer vision applications. Traditional visual relocalization is achieved mainly through geometric methods, whereas PoseNet introduced convolutional neural networks to visual relocalization for the first time, enabling real-time camera pose estimation from a single image. To address the limited accuracy and robustness of the current PoseNet algorithm in complex environments, this paper proposes and implements a new high-precision, robust camera pose estimation method (LRF-PoseNet). The method resizes the input image directly, without cropping, so as to enlarge the receptive field over the training images. The images and their corresponding pose labels are then fed into an improved LSTM-based PoseNet network for training, and the Adam optimizer is used to optimize the network. Finally, the trained network is used to estimate the camera pose. Experimental results on a public RGB dataset show that the proposed method obtains more accurate camera poses than existing CNN-based methods.
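The pipeline described above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the backbone choice, layer sizes, target resolution, and loss weight beta are all assumptions, and only the broad structure (resize without cropping, CNN features passed through an LSTM, two regression heads, Adam optimization) follows the abstract.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Resize the whole image rather than cropping it, as the abstract describes,
# so the network sees the full field of view. The 256x256 size is an assumption.
preprocess = transforms.Compose([
    transforms.Resize((256, 256)),   # direct resize, no center crop
    transforms.ToTensor(),
])

class LSTMPoseNet(nn.Module):
    """Hypothetical sketch of an LSTM-based PoseNet-style regressor.

    A CNN backbone produces a feature vector, an LSTM processes it as a
    short sequence, and two heads regress the 3-D position and the 4-D
    orientation quaternion. All layer sizes here are assumptions.
    """
    def __init__(self, hidden=256):
        super().__init__()
        backbone = models.resnet18(weights=None)         # backbone choice is an assumption
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.fc_xyz = nn.Linear(hidden, 3)               # position head
        self.fc_quat = nn.Linear(hidden, 4)              # orientation head

    def forward(self, x):
        f = self.features(x).flatten(1)                  # (B, 512) feature vector
        seq = f.view(f.size(0), -1, 32)                  # reshape into a (B, 16, 32) sequence
        _, (h, _) = self.lstm(seq)
        h = h[-1]                                        # last hidden state, (B, hidden)
        return self.fc_xyz(h), self.fc_quat(h)

model = LSTMPoseNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, per the abstract

def pose_loss(xyz, quat, xyz_gt, quat_gt, beta=500.0):
    # PoseNet-style weighted loss; beta (assumed) balances metres vs. quaternion units.
    quat = quat / quat.norm(dim=1, keepdim=True)         # normalize predicted quaternion
    return (torch.norm(xyz - xyz_gt, dim=1) +
            beta * torch.norm(quat - quat_gt, dim=1)).mean()
```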
Highlights
Visual relocalization plays a key role in photogrammetric computer vision, autonomous driving and robotics (Husain, 2019; Acharya, 2019; Wang, 2020; Pham, 2021).
This paper proposes LRF-PoseNet, a high-precision camera pose estimation method.
As the figure shows, the error points of the improved method lie closer to the origin; that is, both the position and orientation errors are smaller, indicating that the proposed method significantly improves on the accuracy of PoseNet.
Summary
Visual relocalization plays a key role in photogrammetric computer vision, autonomous driving and robotics (Husain, 2019; Acharya, 2019; Wang, 2020; Pham, 2021). Traditional geometry-based visual relocalization is realized mainly by local feature matching against a known 3D model of the environment created by SfM (Sattler, 2017; Han, 2019) or SLAM (Mur-Artal, 2015; Mur-Artal, 2017). Local 2D feature points extracted from the query image are matched with the corresponding 3D points in the model to establish 2D-3D correspondences (Bay, 2006; Lowe, 2004; Rublee, 2011), and the six-degree-of-freedom camera pose is then solved by PnP and related algorithms (Lepetit, 2009; Hesch, 2011). Since the matching cost grows rapidly with the number of key points, matching in a large, dense feature space is very expensive.
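The classical 2D-3D matching and PnP pipeline summarized above can be illustrated with OpenCV. This is a minimal sketch under stated assumptions, not the method of any cited paper: the model files, intrinsics, and ORB parameters are hypothetical placeholders standing in for a prebuilt SfM/SLAM model.

```python
import cv2
import numpy as np

# Hypothetical inputs from a prebuilt SfM/SLAM model:
# db_descriptors[i] is the descriptor of the 3-D model point points3d[i].
db_descriptors = np.load("model_descriptors.npy")   # (N, 32) ORB descriptors, assumed file
points3d = np.load("model_points3d.npy")            # (N, 3) world coordinates, assumed file
K = np.array([[525.0,   0.0, 320.0],                # assumed pinhole intrinsics
              [  0.0, 525.0, 240.0],
              [  0.0,   0.0,   1.0]])

query = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)

# 1. Extract local 2-D features from the query image (ORB here; SIFT is also cited).
orb = cv2.ORB_create(nfeatures=2000)
keypoints, descriptors = orb.detectAndCompute(query, None)

# 2. Match query descriptors against the model's descriptors. Brute-force matching
#    illustrates why large, dense feature spaces make this step expensive.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(descriptors, db_descriptors.astype(np.uint8))

# 3. Build 2D-3D correspondences and solve the 6-DoF pose with PnP + RANSAC.
pts2d = np.float32([keypoints[m.queryIdx].pt for m in matches])
pts3d = np.float32([points3d[m.trainIdx] for m in matches])
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts3d, pts2d, K, None)
if ok:
    R, _ = cv2.Rodrigues(rvec)   # rotation matrix; the camera pose is (R, tvec)
```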