AAGF: An Efficient Transformer With Mix-Features For Visual Place Recognition

Abstract

Visual Place Recognition (VPR) is the task of predicting the current location solely from the visual features of images. It is sensitive to changes in perspective, lighting, and environmental conditions. Current VPR methods still rely on re-ranking for strong performance, and pure global retrieval alone remains less effective. To address this, we introduce a novel Transformer-based feature aggregation model, Agent-Attention with Gating Forward (AAGF), which aggregates the global relationships in feature maps produced by a pre-trained backbone into a new global feature. In addition, we propose a training strategy, Mix-Features Data Augment, to increase feature diversity and make the model more robust. Experiments on multiple benchmarks show that our approach outperforms many existing techniques for aggregating features from lightweight pre-trained backbone networks.
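To make the aggregation idea concrete, the following is a minimal sketch of how agent-style attention followed by a gated feed-forward layer could turn backbone tokens into one global descriptor. It is not the paper's implementation: the abstract gives no architectural details, so the average-pooled agent tokens, the sigmoid gate, the residual connections, the mean pooling, the omission of learned query/key/value projections, and every function name and size below are assumptions chosen for illustration. It uses only the Python standard library so it runs without a deep learning framework.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m, scale=1.0):
    """Row-wise softmax of scale * m, shifted by the row max for stability."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(scale * (v - top)) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def agent_attention(q, k, v, num_agents):
    """Attention routed through a few agent tokens instead of all token pairs."""
    n, d = len(q), len(q[0])
    assert n % num_agents == 0, "sketch assumes tokens split evenly into agents"
    group = n // num_agents
    # Assumption: agent tokens are average-pooled groups of query tokens.
    agents = [
        [sum(q[i][j] for i in range(g * group, (g + 1) * group)) / group
         for j in range(d)]
        for g in range(num_agents)
    ]
    scale = 1.0 / math.sqrt(d)
    # Step 1: agents gather information from all keys/values.
    agent_values = matmul(softmax_rows(matmul(agents, transpose(k)), scale), v)
    # Step 2: every query reads the gathered information back from the agents.
    return matmul(softmax_rows(matmul(q, transpose(agents)), scale), agent_values)


def gated_forward(x, w_value, w_gate, w_out):
    """Feed-forward layer whose hidden units are scaled by a sigmoid gate (assumed form)."""
    value = matmul(x, w_value)
    gate = matmul(x, w_gate)
    hidden = [[a / (1.0 + math.exp(-b)) for a, b in zip(rv, rg)]
              for rv, rg in zip(value, gate)]
    return matmul(hidden, w_out)


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def global_descriptor(tokens, num_agents, weights):
    """Aggregate backbone tokens into one L2-normalized global feature."""
    x = add(tokens, agent_attention(tokens, tokens, tokens, num_agents))
    x = add(x, gated_forward(x, *weights))
    pooled = [sum(col) / len(x) for col in zip(*x)]  # assumed mean pooling
    norm = math.sqrt(sum(p * p for p in pooled)) or 1.0
    return [p / norm for p in pooled]


# Toy stand-in for a flattened backbone feature map: 16 tokens of dimension 8.
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


tokens = rand_matrix(16, 8)
weights = (rand_matrix(8, 16), rand_matrix(8, 16), rand_matrix(16, 8))
descriptor = global_descriptor(tokens, num_agents=4, weights=weights)
```

In this sketch the attention cost grows with the number of tokens times the number of agents rather than with the square of the number of tokens, which is the usual motivation for agent-style attention in a lightweight aggregator. Two descriptors produced this way could be compared by their dot product, since each is normalized to unit length.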
