Abstract
When mobile robots operate in indoor environments, the collected images often contain many visually similar scenes, which can cause false positives in loop closure detection for simultaneous localization and mapping (SLAM). To address this problem, this paper proposes a loop closure detection algorithm for visual SLAM based on image semantic segmentation. Specifically, the current frame is semantically segmented by an optimized DeepLabv3+ model to obtain the semantic labels in the image. The 3D coordinates of the semantic node corresponding to each semantic label are then extracted by combining the mask centroid with the image depth information. According to the distribution of the semantic nodes, the DBSCAN density clustering algorithm is used to cluster densely distributed semantic nodes, which avoids mismatches caused by closely spaced semantic nodes in the subsequent matching process. Finally, a coarse-to-fine multidimensional similarity comparison is used to screen loop closure candidate frames from the key frames and then confirm the true loop closures, completing accurate loop closure detection. Experiments on public datasets and a self-filmed dataset show that the proposed algorithm adapts well to illumination change, viewpoint deviation, and moved or missing items, and that it effectively improves the accuracy of loop closure detection in indoor environments.
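The node extraction and clustering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the DBSCAN parameters `eps` and `min_pts`, and all function names are hypothetical, and the depth value is assumed to be already read at the mask centroid and expressed in metres.

```python
import math

def mask_centroid(mask):
    """Return the (u, v) pixel centroid of a binary mask given as a list of rows."""
    us, vs = [], []
    for v, row in enumerate(mask):
        for u, value in enumerate(row):
            if value:
                us.append(u)
                vs.append(v)
    if not us:
        raise ValueError("mask is empty")
    return sum(us) / len(us), sum(vs) / len(vs)

def back_project(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth depth_m to camera coordinates.

    The intrinsics are assumed values; the paper's calibration is not given here.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 3D points: one label per point, -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force neighbourhood query (includes the point itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels
```

For example, `dbscan(nodes, eps=0.3, min_pts=2)` would group semantic nodes that lie within 0.3 m of each other and leave isolated nodes labelled `-1`. Both values are placeholders that would need tuning for a real scene.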
Highlights
Simultaneous localization and mapping (SLAM) is one of the research hotspots in the field of robotics
To verify the effectiveness of the loop closure detection algorithm proposed in this paper, the NYUv2 dataset is used to pretrain the optimized DeepLabv3+ semantic segmentation model, and the proposed algorithm is compared in detail with stacked denoising autoencoder (SDA) [18], CNN-W [21], DBoW [36], and Conv3 [37] on the public TUM RGB-D and SUN RGB-D datasets and on a self-filmed dataset
The SUN RGB-D dataset [39] is a public dataset published by the Vision & Robotics Group of the University of Princeton. It provides a total of 10335 RGB-D images captured in environments such as universities, houses, and furniture stores in North America and Asia, and it can be used for tasks ranging from semantic segmentation to object detection to scene recognition. The images in this dataset have a resolution of 640 × 480 and a bit depth of 16
Summary
Simultaneous localization and mapping (SLAM) is one of the research hotspots in the field of robotics. Therefore, it is necessary to introduce loop closure detection during robot movement to eliminate the accumulated errors of the visual odometer in SLAM [2]. Loop closure detection methods based on deep learning extract various image features with the help of a deep neural network and calculate the similarity between features to judge whether there is a loop closure. Extracting image features with a convolutional neural network has attracted extensive attention, and it outperforms traditional loop closure detection methods in multiple tasks. Semantic segmentation is applied to loop closure detection algorithms because it is less sensitive to illumination change and object shape and can accurately segment the current environment to obtain accurate image information.
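The feature-similarity check mentioned above can be illustrated with a small sketch. It assumes each frame is already summarized by a fixed-length feature vector and uses cosine similarity with a threshold of 0.9. Both the metric and the threshold are assumptions for illustration, not values from the paper, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def loop_candidates(current, keyframes, threshold=0.9):
    """Return indices of key frames whose features are similar enough to the current frame.

    The threshold is an assumed value and would need tuning per dataset.
    """
    return [
        i for i, kf in enumerate(keyframes)
        if cosine_similarity(current, kf) >= threshold
    ]
```

A call such as `loop_candidates(current_features, keyframe_features)` would return only candidate frames. A real system would still need a second verification stage before accepting a loop closure.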