Abstract

Image-based localization, or camera re-localization, is a fundamental task in computer vision and essential for navigation in robotics and autonomous driving as well as for virtual and augmented reality. Such image pose regression in 6 Degrees of Freedom (DoF) has recently been solved by Convolutional Neural Networks (CNNs). However, well-established methods based on feature matching still achieve higher accuracy. We therefore investigate how data augmentation can further improve CNN-based pose regression. Data augmentation is a valuable technique for boosting the performance of training-based methods and is widespread in the computer vision community. Our aim in this paper is to show the benefit of data augmentation for pose regression by CNNs. For this purpose, images are rendered from a 3D model of the actual test environment. This model is in turn generated from the original training data set, so no additional information or data is required. Furthermore, we introduce different training sets composed of rendered and real images. We show that enhancing the training of CNNs with 3D models of the environment improves image localization accuracy. On the investigated data set, the accuracy of pose regression improved by up to 69.37 % for the position component and 61.61 % for the rotation component.

Highlights

  • Image localization or re-localization is an important and popular task in the computer vision community

  • Experiments are carried out with the modified VGG16-Net and SqueezePoseNet to investigate the improvement of image localization by data augmentation

  • Besides pose regression by Convolutional Neural Networks (CNNs), experiments on feature matching with a Bag of Visual Words (BoVW) approach are carried out


Summary

Introduction

Image localization or re-localization is an important and popular task in the computer vision community. We tackle the problem of such pose estimation in 6 Degrees of Freedom (DoF) by utilizing Convolutional Neural Networks (CNNs). Data augmentation is introduced to support the training process: a 3D model of the test environment is used to render arbitrary images. Localization or pose estimation is of high interest in the fields of autonomous navigation, robotics and augmented or virtual reality. Satellite-based navigation systems are commonly used for this purpose, but multipath effects or shadowing make them vulnerable with regard to safe and continuous navigation. Such navigation frameworks are therefore often fused with local navigation methods, like Inertial Navigation Systems (INSs) or Visual Odometry (VO), to overcome these drawbacks. A promising extension or supplement to existing navigation networks could be introduced by utilizing CNNs.
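To make the reported quantities concrete, the following is a minimal sketch of how position and rotation errors of a 6-DoF pose, and a relative improvement in percent, are commonly computed. It assumes a pose given as a 3D position plus an orientation quaternion, which is a common convention in CNN-based pose regression but is not confirmed by the text above. The function names and the example values are hypothetical and are not taken from the paper.

```python
import math


def position_error(p_pred, p_true):
    """Euclidean distance between predicted and true camera positions."""
    return math.dist(p_pred, p_true)


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees between two orientations given as quaternions."""
    # Normalise so that the inputs need not be unit quaternions.
    n_pred = math.sqrt(sum(c * c for c in q_pred))
    n_true = math.sqrt(sum(c * c for c in q_true))
    dot = sum(a * b for a, b in zip(q_pred, q_true)) / (n_pred * n_true)
    # q and -q describe the same rotation, hence the absolute value;
    # the clamp guards against rounding slightly above 1.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))


def relative_improvement(baseline_error, augmented_error):
    """Error reduction in percent, relative to the baseline error."""
    return 100.0 * (baseline_error - augmented_error) / baseline_error


if __name__ == "__main__":
    # Illustrative values only (assumption), not results from the paper.
    print(position_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    half = math.sqrt(0.5)
    print(rotation_error_deg((1.0, 0.0, 0.0, 0.0), (half, 0.0, 0.0, half)))
    print(relative_improvement(2.0, 0.5))
```

Under this reading, an improvement figure such as those in the abstract would describe how much an error shrinks relative to training without augmentation. Whether the authors evaluate on mean or median errors is not stated here.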
