Abstract

We propose a novel bag-of-words (BoW) framework for building and retrieving a compact database of view images for use in robotic localization, mapping, and SLAM applications. Unlike most previous methods, our method does not describe an image by its many small local features (e.g., bag-of-SIFT-features). Instead, the proposed bag-of-bounding-boxes (BoBB) approach describes an image by fewer, larger object patterns, which yields a semantic and compact image descriptor. To make the view retrieval system more practical and autonomous, the object pattern discovery is unsupervised, performed via common pattern discovery (CPD) between the input image and known reference images, without requiring a pre-trained object detector. Moreover, our CPD subtask does not rely on accurate image segmentation and handles scale variations by exploiting a recently developed CPD technique, spatial random partition. Following traditional bounding-box-based object annotation and knowledge transfer, we compactly describe an image in BoBB form. Using a slightly modified inverted file system, we efficiently index and search the BoBB descriptors. Experiments on the publicly available “Robot-Car” dataset show that the proposed method achieves accurate object-level view image retrieval with highly compact image descriptors, e.g., 20 words per image.
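
As a rough illustration of the retrieval side of this pipeline, the sketch below indexes bag-of-bounding-boxes descriptors with a simple inverted file. It assumes each discovered bounding box has already been quantized to an integer word ID (the CPD and quantization steps are not reproduced), and the class name, image IDs, and TF-style overlap scoring are illustrative assumptions, not the paper's exact slightly modified inverted file system.

```python
from collections import defaultdict, Counter

class InvertedFileIndex:
    """Minimal inverted file over bag-of-bounding-boxes (BoBB) descriptors.

    Each image is summarized by a small set of quantized object-pattern
    word IDs, one per discovered bounding box (~20 words per image).
    """

    def __init__(self):
        # word id -> list of (image id, term frequency)
        self.postings = defaultdict(list)
        # image id -> descriptor length, used for score normalization
        self.doc_lengths = {}

    def add_image(self, image_id, bobb_words):
        """Index one view image given its list of BoBB word IDs."""
        counts = Counter(bobb_words)
        self.doc_lengths[image_id] = len(bobb_words)
        for word, tf in counts.items():
            self.postings[word].append((image_id, tf))

    def query(self, bobb_words, top_k=5):
        """Rank indexed views by word overlap with the query descriptor."""
        scores = Counter()
        for word, qtf in Counter(bobb_words).items():
            for image_id, tf in self.postings.get(word, []):
                scores[image_id] += qtf * tf
        # Normalize by descriptor length so short descriptors are comparable.
        ranked = sorted(
            ((score / self.doc_lengths[image_id], image_id)
             for image_id, score in scores.items()),
            reverse=True,
        )
        return ranked[:top_k]

# Usage: index two views, then retrieve the best match for a query view.
index = InvertedFileIndex()
index.add_image("view_001", [3, 17, 42, 42, 8])
index.add_image("view_002", [5, 17, 9])
print(index.query([42, 17, 3]))  # view_001 should rank first
```

Because each descriptor holds only a handful of words, both the postings lists and the per-query work stay small, which is what makes this kind of object-level index compact relative to bag-of-SIFT approaches with thousands of local features per image.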
