Abstract

Collecting correlated scene images and camera poses is an essential step towards learning absolute camera pose regression models. While acquiring such data in living environments is relatively easy by following regular roads and paths, it remains challenging in constricted industrial environments. This is because industrial objects vary in size and inspections are usually carried out with non-constant motions. As a result, regression models are more sensitive to the viewpoints and distances from which scene images are taken. Motivated by this, we present a simple but efficient camera pose data collection method, WatchPose, to improve the generalization and robustness of camera pose regression models. Specifically, WatchPose tracks nested markers and visualizes viewpoints in an Augmented Reality (AR)-based manner to guide users to collect training data from a broader range of camera-object distances and more diverse views around the objects. Experiments show that WatchPose can effectively improve the accuracy of existing camera pose regression models compared to the traditional data acquisition method. We also introduce a new dataset, Industrial10, to encourage the community to adapt camera pose regression methods to more complex environments.
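To make the idea of guiding users towards "broader camera-object distances and more diverse views" concrete, the following is a minimal sketch of one way viewpoint coverage could be tracked. It is not the WatchPose implementation: the function names `view_bin` and `uncovered_bins`, the bin counts, and the distance edges are all hypothetical choices made for illustration, and camera positions are assumed to be already expressed in metres in the same frame as the object.

```python
import math

def view_bin(cam_pos, obj_pos, n_azimuth=8, n_elevation=3,
             distance_edges=(0.5, 1.0, 2.0)):
    """Map a camera position to an (azimuth, elevation, distance) bin
    around the object. All bin settings are illustrative assumptions."""
    dx, dy, dz = (c - o for c, o in zip(cam_pos, obj_pos))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.atan2(dy, dx) % (2.0 * math.pi)          # [0, 2*pi)
    elevation = math.asin(dz / dist) if dist > 0 else 0.0   # [-pi/2, pi/2]
    # Clamp to the last bin so values on the upper boundary stay in range.
    a = min(int(azimuth / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    e = min(int((elevation + math.pi / 2.0) / math.pi * n_elevation),
            n_elevation - 1)
    # Distance bin = number of edges the camera-object distance has reached.
    d = sum(dist >= edge for edge in distance_edges)
    return a, e, d

def uncovered_bins(cam_positions, obj_pos, n_azimuth=8, n_elevation=3,
                   distance_edges=(0.5, 1.0, 2.0)):
    """Return the sorted list of view bins that no recorded frame falls into."""
    seen = {view_bin(p, obj_pos, n_azimuth, n_elevation, distance_edges)
            for p in cam_positions}
    all_bins = {(a, e, d)
                for a in range(n_azimuth)
                for e in range(n_elevation)
                for d in range(len(distance_edges) + 1)}
    return sorted(all_bins - seen)

if __name__ == "__main__":
    frames = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.5), (-0.3, -0.3, 0.2)]
    missing = uncovered_bins(frames, obj_pos=(0.0, 0.0, 0.0))
    print(f"{len(missing)} view bins still need coverage")
```

Under these assumptions, a guidance tool could highlight the bins returned by `uncovered_bins` so that the operator knows which directions and distances are still missing from the training set.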

Highlights

  • Camera pose estimation is a fundamental task in Simultaneous Localization and Mapping (SLAM) and Augmented Reality (AR) applications [1,2,3]

  • The contributions of this work are summarized as follows: (1) We propose a novel training data collection method called WatchPose to improve the performance of camera pose regression in industrial environments

  • We introduced a simple but efficient data collection method for complex industrial environments named WatchPose so as to learn effective absolute camera pose regression models


Summary

Introduction

Camera pose (location and orientation) estimation is a fundamental task in Simultaneous Localization and Mapping (SLAM) and Augmented Reality (AR) applications [1,2,3]. Instead of using machine learning for only specific parts of the estimation pipeline [14,15,16,17], absolute camera pose regression methods aim to learn the full pipeline from a set of training images and their corresponding poses. The trained models directly regress the camera pose from an input image. Several works [4,7] in the literature report that these methods are viable in regular living environments (e.g., along a street or path), achieving accuracies of around 9∼25 m in localization and 4∼17° in orientation. However, the utility of this methodology is limited in industrial environments. Traditional data collection methods cannot cover enough viewpoints to properly train a generalized pose regression model. Viewpoints in industrial environments may be restricted to

Sensors 2020, 20, 3045; doi:10.3390/s20113045 www.mdpi.com/journal/sensors
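The localization and orientation accuracies quoted above are usually reported as a position error in metres and an angular error in degrees. The sketch below shows one common way to compute both, assuming the pose is given as a 3D position plus a quaternion; this excerpt does not state the exact metric used, so the quaternion convention and the function names `position_error` and `orientation_error_deg` are assumptions for illustration only.

```python
import math

def position_error(t_pred, t_true):
    """Euclidean distance between predicted and ground-truth camera
    positions, in the same unit as the inputs (e.g. metres)."""
    return math.dist(t_pred, t_true)

def orientation_error_deg(q_pred, q_true):
    """Angle in degrees of the rotation separating two orientations,
    each given as a 4-component quaternion (need not be unit length)."""
    norm_p = math.sqrt(sum(c * c for c in q_pred))
    norm_t = math.sqrt(sum(c * c for c in q_true))
    dot = sum(p * t for p, t in zip(q_pred, q_true)) / (norm_p * norm_t)
    # abs() treats q and -q as the same rotation; min() keeps acos in range
    # when rounding pushes the value slightly above 1.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))

if __name__ == "__main__":
    # Hypothetical prediction: 5 m off and rotated 90 degrees about one axis.
    half = math.sqrt(0.5)
    print(position_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    print(orientation_error_deg((1.0, 0.0, 0.0, 0.0), (half, 0.0, 0.0, half)))
```

Taking the absolute value of the quaternion dot product matters in practice, because a regression model may output either sign for the same physical orientation.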

