Loop closure detection (LCD) is an important component of Simultaneous Localization and Mapping (SLAM) because it reduces accumulated position errors. In this letter, we propose a novel loop closure detection algorithm named ESA-VLAD. The core of ESA-VLAD is a redesigned network that uses EfficientNetB0 as the backbone for extracting global features and integrates a second-order attention module to learn the correlations between features within the feature map. A trainable Vector of Locally Aggregated Descriptors layer (NetVLAD) is integrated as the last layer of the network to generate a compact, fixed-length global feature. A knowledge distillation strategy is adopted to accelerate training of the proposed network. Hierarchical Navigable Small World (HNSW) search over the global features is employed to retrieve loop closure candidate images. In addition, an efficient geometrical consistency check based on local difference binary (LDB) descriptors is designed to verify loop closure matches. Experiments on several public datasets demonstrate that ESA-VLAD achieves higher recall at 100% precision and lower per-frame processing time than other typical and state-of-the-art methods.
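The candidate retrieval step can be illustrated with a minimal sketch. It is not the ESA-VLAD implementation: an exhaustive cosine-similarity search is swapped in for the HNSW index, the descriptors are toy 3-D vectors rather than NetVLAD outputs, and the names `retrieve_candidates`, `top_k`, and `exclude_recent` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve_candidates(query, database, top_k=3, exclude_recent=0):
    """Return (index, cosine similarity) pairs for the top_k most similar frames.

    Assumption: this brute-force scan stands in for an HNSW index, which
    would approximate the same search without visiting every descriptor.
    """
    q = l2_normalize(query)
    # Assumption: skip the newest frames, which are trivially similar
    # to the query and are not genuine loop closures.
    searchable = database[: max(0, len(database) - exclude_recent)]
    scored = [
        (i, sum(a * b for a, b in zip(q, l2_normalize(d))))
        for i, d in enumerate(searchable)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): global descriptors of four earlier frames.
database = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

candidates = retrieve_candidates(query, database, top_k=2, exclude_recent=1)
candidate_ids = [index for index, _ in candidates]  # frames 0 and 2
```

In a full pipeline, each returned candidate would then be passed to the geometrical consistency check before being accepted as a loop closure.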