Abstract

This paper presents an indoor relocalization system using a dual-stream convolutional neural network (CNN) that takes both color images and depth images as inputs. To address the pose regression problem, we introduce a deep neural network architecture for RGB-D images, present a staged training method for the dual-stream CNN, discuss different depth image encoding methods, and propose a novel encoding method. By introducing range information into the network through a dual-stream architecture, we not only improve relocalization accuracy by about 20% compared with the state-of-the-art deep learning method for pose regression, but also greatly enhance system robustness in challenging scenes such as large-scale, dynamic, fast-movement, and night-time environments. To the best of our knowledge, this is the first work to solve the indoor relocalization problem using deep CNNs with an RGB-D camera. The method is first evaluated on the Microsoft 7-Scenes data set to show its advantage in accuracy over other CNNs. Large-scale indoor relocalization is then demonstrated using our method; the experimental results show that an accuracy of 0.3 m in position and 4° in orientation can be obtained. Finally, the method is evaluated on challenging indoor data sets collected with a motion capture system. The results show that relocalization performance is hardly affected by dynamic objects, motion blur, or night-time environments.

Note to Practitioners: This paper was motivated by the limitations of existing indoor relocalization technology, which is important for mobile robot navigation. Using this technology, robots can infer where they are in a previously visited place. Previous visual localization methods have seen limited practical use because they place strict requirements on the environment. When faced with challenging scenes such as large-scale environments, dynamic objects, motion blur caused by fast movement, night-time environments, or other scenes whose appearance has changed, most existing methods tend to fail. This paper introduces deep learning into the indoor relocalization problem and uses a dual-stream CNN (a depth stream and a color stream) to perform 6-DOF pose regression in an end-to-end manner. The localization error is about 0.3 m and 4° in a large-scale indoor environment. More importantly, the proposed system remains effective in some challenging scenes. The proposed depth image encoding method can also be adopted in other deep neural networks that use RGB-D cameras as the sensor.
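
The accuracy figures quoted above (0.3 m in position, 4° in orientation) are pose errors. As a minimal sketch of how such errors are commonly computed, assuming each pose is stored as a 3-D position plus a quaternion in (w, x, y, z) order, the Python functions below measure the Euclidean position error and the smallest rotation angle between two orientations. The function names and the quaternion convention are illustrative assumptions; the abstract does not specify the exact metric used in the paper.

```python
import math


def position_error(t_pred, t_true):
    """Euclidean distance between two 3-D positions (same unit as the inputs)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(t_pred, t_true)))


def orientation_error_deg(q_pred, q_true):
    """Smallest rotation angle, in degrees, between two quaternions (w, x, y, z).

    Assumption: this is the usual angular-distance metric for pose regression,
    not necessarily the exact one used in the paper.
    """
    def normalize(q):
        norm = math.sqrt(sum(c * c for c in q))
        return [c / norm for c in q]

    a, b = normalize(q_pred), normalize(q_true)
    # q and -q encode the same rotation, so take the absolute dot product.
    dot = abs(sum(x * y for x, y in zip(a, b)))
    # Clamp to guard against rounding slightly above 1.0 before acos.
    dot = min(1.0, dot)
    return math.degrees(2.0 * math.acos(dot))


if __name__ == "__main__":
    # Hypothetical prediction: 0.3 m off along x, rotated 4 degrees about z.
    half_angle = math.radians(4.0) / 2.0
    q_pred = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
    print(position_error((0.3, 0.0, 0.0), (0.0, 0.0, 0.0)))
    print(orientation_error_deg(q_pred, (1.0, 0.0, 0.0, 0.0)))
```

Taking the absolute value of the dot product keeps the metric insensitive to the sign ambiguity of quaternions, which matters when a network regresses quaternion components directly.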
