Abstract

Cross-view geo-localization aims to localize the same geographic target in images taken from different perspectives, e.g., satellite view and drone view. The primary challenge for existing methods is the large change in visual appearance across views. Most previous work uses deep neural networks to obtain discriminative representations and applies them directly to the geo-localization task. However, these approaches overlook the redundancy retained in the extracted features, which negatively impacts the result. In this paper, we argue that the information bottleneck (IB) can retain the most relevant information while removing as much redundancy as possible, and that the variational self-distillation (VSD) strategy provides an accurate and analytical way to estimate the mutual information. To this end, we propose to learn discriminative representations via variational self-distillation (dubbed LDRVSD). Extensive experiments are conducted on two widely used datasets, University-1652 and CVACT, showing the remarkable performance improvements obtained by our LDRVSD method compared with several state-of-the-art approaches.
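The abstract does not give the paper's exact VSD objective, but the information-bottleneck idea it builds on is standard: minimize a task term (bounding the relevant information I(z; y) from below) plus a weighted compression term (bounding the redundancy I(z; x) from above). A minimal sketch of that compression term, assuming a diagonal Gaussian encoder with a standard-normal prior (the usual variational IB setup, not necessarily the authors' architecture; `beta` and the function names are illustrative):

```python
import math

def kl_diag_gaussian_to_std_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)) for a diagonal Gaussian encoder.
    In the variational IB this KL term upper-bounds I(z; x), i.e. the
    redundancy that the bottleneck is encouraged to compress away."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )

def ib_loss(task_nll, mu, log_var, beta=1e-3):
    """IB-style objective: task negative log-likelihood (a bound related
    to -I(z; y)) plus a beta-weighted compression penalty on I(z; x)."""
    return task_nll + beta * kl_diag_gaussian_to_std_normal(mu, log_var)
```

For example, an encoder output exactly matching the prior (`mu = 0`, `log_var = 0`) incurs zero compression penalty, so the loss reduces to the task term alone; larger `beta` trades task performance for a more compressed, less redundant representation.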
