Abstract

Loop closure detection is a key challenge in visual simultaneous localization and mapping (SLAM) systems and has attracted significant research interest in recent years. It entails correctly determining whether a mobile robot has previously visited a scene, which is needed to build a consistent map of the robot's motion. Many loop closure detection methods have been proposed, but most of them rely on handcrafted features and show weak robustness to illumination variations. In this paper, we investigate a Siamese Convolutional Neural Network (SCNN) for loop closure detection in RGB-D SLAM. First, we use a pre-trained SCNN model to extract features as image descriptors; then, the L2-norm distance is adopted as a similarity metric between descriptors. Regarding the learned features used for matching, two key issues are discussed: (1) how to define an appropriate loss as supervision (the cross-entropy loss, the contrastive loss, or a combination of the two); and (2) how to combine the appearance information in RGB images with the position information in depth images (early fusion, mid-level fusion, or late fusion). We compare the proposed method against different baselines in experiments on two public datasets (New College and NYU), and it outperforms the state-of-the-art methods.
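As a rough illustration of the matching step described in the abstract, the sketch below computes the L2 distance between two descriptors, evaluates a commonly used form of the contrastive loss for one pair, and flags a loop closure when the nearest stored descriptor is close enough. The function names, the `margin` and `threshold` parameters, and the exact loss form are assumptions made for illustration; the paper's own formulation may differ, and the descriptors here are plain lists standing in for SCNN features.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(distance, same_place, margin=1.0):
    """A common contrastive loss for one pair (assumed form, not the paper's).

    Matching pairs are pulled together; non-matching pairs are pushed
    apart until their distance exceeds the margin.
    """
    if same_place:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2

def detect_loop(query, keyframes, threshold):
    """Return (index, distance) of the nearest stored descriptor if it is
    within the (hypothetical) threshold, otherwise None."""
    best = None
    for i, descriptor in enumerate(keyframes):
        dist = l2_distance(query, descriptor)
        if best is None or dist < best[1]:
            best = (i, dist)
    if best is not None and best[1] <= threshold:
        return best
    return None
```

In practice the threshold would be tuned on validation data, since it trades off missed loop closures against false matches.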

Highlights

  • Visual simultaneous localization and mapping (SLAM) is one of the fundamental problems in robotics, with numerous important applications such as robot motion [1,2], trajectory planning [3], and driving recorders [4]

  • We investigate the performance of three fusion strategies for the RGB and depth streams at different locations of the Siamese Convolutional Neural Network (SCNN): early fusion, mid-level fusion, and late fusion

  • In early fusion, RGB images and depth images are combined at the beginning of the SCNN. In mid-level fusion, the two streams are fused at the intermediate layers of the networks, while in late fusion, the two streams produce their feature descriptors separately
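The difference between the three fusion points can be sketched with a toy two-stage pipeline. Everything here is a hypothetical stand-in: `stage1` (per-channel mean) and `stage2` (L2 normalization) replace the real convolutional layers, and combining the late-fusion descriptors by concatenation is an assumption, since the source does not say how they are merged.

```python
import math

def stage1(pixels):
    """Toy stand-in for the early network layers: per-channel mean."""
    n = len(pixels)
    channels = len(pixels[0])
    return [sum(p[c] for p in pixels) / n for c in range(channels)]

def stage2(features):
    """Toy stand-in for the later layers: L2-normalize the feature vector."""
    norm = math.sqrt(sum(f * f for f in features))
    return [f / norm for f in features] if norm > 0 else list(features)

def early_fusion(rgb, depth):
    # Stack RGB and depth into 4-channel pixels before any processing.
    stacked = [tuple(p) + tuple(d) for p, d in zip(rgb, depth)]
    return stage2(stage1(stacked))

def mid_level_fusion(rgb, depth):
    # Run the first stage per stream, fuse, then continue jointly.
    return stage2(stage1(rgb) + stage1(depth))

def late_fusion(rgb, depth):
    # Run each stream to the end, then concatenate the descriptors
    # (concatenation is an assumed way of combining them).
    return stage2(stage1(rgb)) + stage2(stage1(depth))
```

In this toy, early and mid-level fusion give the same result because the stand-in first stage treats every channel independently; a real convolutional layer mixes channels, so the two would generally differ. Late fusion already differs here because each stream is normalized on its own.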


Summary

Introduction

Visual simultaneous localization and mapping (SLAM) is one of the fundamental problems in robotics, with numerous important applications such as robot motion [1,2], trajectory planning [3], and driving recorders [4]. Loop closure detection is a key challenge in graph-based visual SLAM. It aims to determine whether a robot has returned to a previously visited location (such as an office, corridor, or library, as shown in Figure 1), and it is vital for generating a consistent map because it corrects errors that accumulate over time. The problem of visual loop closure detection shares similar ideas with image retrieval, but significant distinctions exist between these two visual tasks. Although loop closure detection has been approached from various angles, all solutions are based on matching and share a common framework.

