Accurate localization between cameras is a prerequisite for a vision-based heterogeneous robot systems task. The core issue is how to accurately perform place recognition from different view-points. Traditional appearance-based methods have a high probability of failure in place recognition and localization under large view-point changes. In recent years, semantic graph matching-based place recognition methods have been proposed to solve the above problem. However, these methods rely on high-precision semantic segmentation results and have a high time complexity in node extraction or graph matching. In addition, methods only utilize the semantic labels of the landmarks themselves to construct graphs and descriptors, making such approaches fail in some challenging scenarios (e.g., scene repetition). In this paper, we propose a graph-matching method based on a novel landmark topology descriptor, which is robust to view-point changes. According to the experiment on real-world data, our algorithm can run in real-time and is approximately four times and three times faster than state-of-the-art algorithms in the graph extraction and matching phases, respectively. In terms of place recognition performance, our algorithm achieves the best place recognition precision at a recall of 0-70% compared with classic appearance-based algorithms and an advanced graph-based algorithm in the scene of significant view-point changes. In terms of positioning accuracy, compared to the traditional appearance-based DBoW2 and NetVLAD algorithms, our method outperforms by 95%, on average, in terms of the mean translation error and 95% in terms of the mean RMSE. Compared to the state-of-the-art SHM algorithm, our method outperforms by 30%, on average, in terms of the mean translation error and 29% in terms of the mean RMSE. In addition, our method outperforms the current state-of-the-art algorithm, even in challenging scenarios where the benchmark algorithms fail.
Read full abstract