Abstract

LiDAR-based place recognition (LPR) aims to localize autonomous vehicles and mobile robots relative to pre-built maps or to retrieve previously visited places. However, the complexity of real-world scenes and changes in viewpoint pose significant challenges for place recognition. As high-level information, semantics makes it easier to distinguish geometrically similar scenes. Unlike most existing methods, which rely on a single type of information (geometric or semantic) to construct scene descriptors, we exploit the complementary nature of semantic and geometric information and propose a semantics-enhanced discriminative feature learning method for LPR. Specifically, we first develop a Multi-layer Fusion Feature Extraction Network (MFFEN), based on the transformer encoder, that hierarchically fuses local geometric and semantic information and uses contextual information to extract discriminative local features. To obtain semantic information, we introduce a dynamic graph convolution network that extracts local semantic features together with their local relations. In addition, to reduce interference from redundant information and dynamic objects in the scene, we design a semantics-guided local attention network (SLAN) that focuses on salient local features useful for recognizing scenes, thereby enhancing the descriptive ability of the global descriptor. Extensive experiments on the public datasets KITTI and KITTI-360 demonstrate that the proposed method outperforms recent LiDAR-based methods on the 3D place recognition task. For instance, it achieves a mean F1max score of 96.9% on the KITTI dataset, surpassing the strongest prior model by 2.7%.
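
For readers unfamiliar with the F1max metric reported above, the following is a minimal sketch of how a maximum F1 score is commonly computed for place recognition: candidate pairs are scored by descriptor similarity, the decision threshold is swept, and the best F1 is kept. The function name `f1_max` and the toy inputs are illustrative assumptions, not the authors' evaluation code, and the exact pairing protocol used in the paper is not stated in the abstract.

```python
import numpy as np


def f1_max(scores, labels):
    """Return the maximum F1 score over all decision thresholds.

    scores: similarity of each candidate pair (higher = more likely same place)
    labels: 1 if the pair is a true revisit, 0 otherwise
    Both names are hypothetical; this is a generic sketch of the metric.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best = 0.0
    # Every distinct score is a candidate threshold.
    for t in np.unique(scores):
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        if tp == 0:
            continue  # precision and recall are both zero here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return float(best)


if __name__ == "__main__":
    # Toy example: four candidate pairs, two of which are true revisits.
    print(f1_max([0.9, 0.6, 0.5, 0.2], [1, 0, 1, 0]))
```

A "mean" F1max, as quoted in the abstract, would then presumably be the average of this value over the evaluated sequences.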
