Visual Place Recognition (VPR), the task of identifying the place where an image was taken, is at the core of important robotics problems such as relocalization, loop-closure detection, and topological navigation. Even indoors, which is the focus of this work, VPR is challenging for a number of reasons, including the need for real-time performance when dealing with large image databases ($\sim10^{4}$ images, possibly captured by different robots) and the need to avoid perceptual aliasing in environments with repetitive structures and scenes. In this letter, we tackle these issues by proposing an off-line mapping technique that abstracts a dense database of georeferenced images, in no particular order, into a multivariate Gaussian Mixture Model by creating soft clusters according to their similarity in both pose and appearance. This abstract representation is obtained through an Expectation-Maximization algorithm and plays the role of a simplified map. Since querying this map yields the probability of being in each cluster, we exploit this "belief" within a Bayesian filter that takes into account previous query images and a topological map between clusters to perform more robust VPR. We evaluate our proposal on two indoor datasets, demonstrating VPR precision comparable to querying the full database while requiring shorter query times and handling perceptual aliasing during sequential navigation.
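As an illustration of how a cluster "belief" can be combined with a topological map, the following is a minimal sketch of one predict/update step of a generic discrete Bayes filter. It is not the authors' implementation: the function name `bayes_filter_step`, the three-cluster transition matrix, and the likelihood values are all hypothetical, chosen only to show how connectivity between clusters can disambiguate two places that look alike.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One predict/update step of a discrete Bayes filter over map clusters.

    belief[i]        -- prior probability of being in cluster i
    transition[i][j] -- assumed probability of moving from cluster i to j
                        (a stand-in for the topological map between clusters)
    likelihood[j]    -- assumed probability of the query image given cluster j
                        (a stand-in for the result of querying the map)
    """
    n = len(belief)
    # Predict: propagate the previous belief through the topological map.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Update: weight each cluster by how well it explains the query image.
    posterior = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(posterior)
    if total == 0.0:
        # The observation contradicts every reachable cluster; keep the prediction.
        return predicted
    return [p / total for p in posterior]


# Hypothetical 3-cluster map: clusters 0 and 2 look alike (perceptual aliasing),
# but cluster 2 cannot be reached directly from cluster 0.
transition = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
belief = [0.9, 0.1, 0.0]         # the robot was most likely in cluster 0
likelihood = [0.45, 0.10, 0.45]  # the query matches clusters 0 and 2 equally

belief = bayes_filter_step(belief, transition, likelihood)
print(belief)
```

Although the query image alone cannot distinguish cluster 0 from cluster 2, the filtered belief stays concentrated on cluster 0, because the assumed transition model makes a jump to cluster 2 very unlikely.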