Abstract
Simultaneous localization and mapping (SLAM) is mainly used to solve the problem of navigating a mobile robot in an unknown environment. Loop closure detection is a key step in visual SLAM and plays an important role in building consistent maps and reducing accumulated pose errors. This paper proposes a new loop closure detection method based on a bag of semantic words (BOSW), which extracts semantic descriptors with a neural network pre-trained on a scene recognition dataset. Traditionally, the visual bag-of-words (BoW) model relies on features such as ORB and SURF, which are easily affected by the environment. Convolutional neural networks, by contrast, extract abstract image features and cope better with lighting changes. For these reasons, we propose a new algorithm that combines the bag-of-words model with a convolutional neural network. Feature maps extracted by a VGG16 network trained on the Places365 dataset (VGG16-Places365) are used as semantic descriptors in place of ORB descriptors. The workflow of building the vocabulary of semantic words, extracting the feature vector of an image, and computing the similarity score is presented. The performance of BOSW is evaluated by comparison with the traditional visual bag-of-words model and with an Inception-based deep learning method that uses the output of Inception's fully connected layer as the image feature vector. Overall, experiments on the New College dataset verify that our model generalizes better and achieves higher accuracy in loop closure detection.
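To make the described workflow concrete, the sketch below illustrates one plausible realization of the BOSW pipeline: local descriptors are taken from the spatial locations of a VGG16 convolutional feature map, clustered into a vocabulary of semantic words, and each image is then scored against another by comparing word histograms. This is not the authors' implementation; the Places365 checkpoint path, the choice of the last convolutional layer, k-means clustering, and the cosine similarity score are all assumptions made for illustration.

```python
# Minimal sketch (assumed details, not the paper's code) of a BOSW-style pipeline:
# 1) extract conv feature maps from a VGG16 backbone (assumed pre-trained on Places365),
# 2) cluster local descriptors into a vocabulary of "semantic words" with k-means,
# 3) represent each image as a word histogram and score similarity with the cosine metric.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics.pairwise import cosine_similarity

# VGG16 backbone; loading Places365 weights is assumed (torchvision ships only ImageNet weights).
vgg = models.vgg16(weights=None)
# vgg.load_state_dict(torch.load("vgg16_places365.pth"))  # hypothetical checkpoint path
features = vgg.features.eval()  # convolutional part only

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def local_descriptors(img_path):
    """One 512-D descriptor per spatial location of the last conv feature map."""
    x = preprocess(Image.open(img_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = features(x)                       # shape (1, 512, 7, 7)
    return fmap.squeeze(0).flatten(1).T.numpy()  # shape (49, 512)

def build_vocabulary(image_paths, n_words=256):
    """Cluster descriptors from a set of training images into semantic words."""
    descs = np.vstack([local_descriptors(p) for p in image_paths])
    return MiniBatchKMeans(n_clusters=n_words, random_state=0).fit(descs)

def bow_vector(img_path, vocab):
    """L2-normalised histogram of semantic-word occurrences for one image."""
    words = vocab.predict(local_descriptors(img_path))
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(np.float64)
    return hist / (np.linalg.norm(hist) + 1e-12)

def similarity(img_a, img_b, vocab):
    """Cosine similarity between two BoW vectors; a high score suggests a loop closure."""
    va, vb = bow_vector(img_a, vocab), bow_vector(img_b, vocab)
    return float(cosine_similarity(va[None, :], vb[None, :])[0, 0])
```

In practice a candidate loop closure would be accepted only when the similarity of the current frame to a previously visited frame exceeds a threshold, which keeps the decision consistent with how traditional BoW-based loop closure detection is applied.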