Abstract
Vector quantization can be a computationally expensive step in visual bag-of-words (BoW) search when the vocabulary is large. A BoW-based appearance SLAM system must address this problem to operate efficiently in real time. We propose an effective method to speed up the vector quantization process in BoW-based visual SLAM. We employ a graph-based nearest neighbor search (GNNS) algorithm for this purpose, and show experimentally that it can outperform state-of-the-art methods. The graph-based search structure used in GNNS can be integrated efficiently into the BoW model and the SLAM framework. The graph-based index, which is a k-NN graph, is built over the vocabulary words and can be extracted from the BoW vocabulary construction procedure by adding one iteration to the k-means clustering, at a small extra cost. Moreover, because the images acquired for appearance-based SLAM are sequential, the GNNS search can be initialized judiciously, which considerably increases the speedup of the quantization process.

I. INTRODUCTION
The bag-of-words (BoW) method was originally proposed for document retrieval. In recent years, it has been applied successfully to image retrieval tasks in the computer vision community (17), (18). The method is attractive because of its efficient image representation and retrieval. BoW represents an image as a sparse vector of visual words, so images can be searched efficiently using an inverted index file system. Moreover, because the complexity of BoW does not grow with the size of the dataset as much as that of other search techniques (e.g., direct feature matching) does, it can be employed for large-scale image search applications.

One major application area that benefits from BoW is appearance-based mobile robot localization and simultaneous localization and mapping (SLAM). SLAM employs BoW to solve the loop closure detection (LCD) problem, a classic and difficult problem in SLAM.
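The abstract describes quantizing descriptors by searching a k-NN graph built over the vocabulary words. A minimal Python sketch of that idea follows, under stated assumptions: the graph is built here by brute force rather than extracted from the k-means procedure as the paper proposes, the search is a plain greedy hill-climb without the restarts or expansion limits a full GNNS implementation may use, and the names `build_knn_graph`, `gnns_quantize`, and `start` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_knn_graph(words, k):
    """Brute-force k-NN graph over the vocabulary words.

    Assumption: used only for illustration; the paper obtains the graph
    from the vocabulary construction step instead.
    """
    graph = []
    for i, w in enumerate(words):
        others = [j for j in range(len(words)) if j != i]
        others.sort(key=lambda j: sq_dist(w, words[j]))
        graph.append(others[:k])
    return graph


def gnns_quantize(query, words, graph, start):
    """Greedy hill-climb on the k-NN graph.

    From the current word, move to the neighbor closest to the query;
    stop when no neighbor is closer than the current word. Returns the
    index of the word reached, which may be a local minimum.
    """
    current = start
    best = sq_dist(query, words[current])
    while True:
        nxt = None
        for j in graph[current]:
            d = sq_dist(query, words[j])
            if d < best:
                best, nxt = d, j
        if nxt is None:
            return current
        current = nxt


# Toy vocabulary: 20 two-dimensional "words" on a line.
words = [(float(i), 0.0) for i in range(20)]
graph = build_knn_graph(words, k=2)
word_id = gnns_quantize((13.2, 0.3), words, graph, start=0)
```

Because consecutive SLAM frames overlap, one plausible way to initialize the search is to pass a word index assigned in the previous frame as `start`; this is an illustrative reading of the sequential-initialization idea, not the authors' stated procedure. Greedy search can stop at a local minimum, so the returned word is an approximate nearest neighbor.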
LCD is addressed as a place recognition problem: the robot should be able to recognize places it has visited before in order to localize itself or refine the map of the environment. This task is performed by matching the robot's current view against the existing map, which contains the images of previously visited locations. In large-scale environments, SLAM maps contain a large number of images that must be matched to solve the loop closure detection problem. Image search in such large maps is challenging and remains an open problem. Although BoW offers an efficient search technique, its vector quantization (VQ) step can be computationally expensive. Vector quantization maps the image feature descriptors to the words