Hashing for Localization (HfL): A Baseline for Fast Localizing Objects in a Large-Scale Scene

Lirong Han, Antonio Plaza, Peng Li, Peng Ren

doi:10.1109/tgrs.2021.3114207

Abstract

Advanced remote-sensing instruments produce very large scenes of the Earth's surface, with very high spatial resolution and dimensionality. Efficiently localizing specific objects in such a large-scale scene is a significant challenge, mainly because of the high computational requirements involved. To tackle this issue, we propose a new hashing for localization (HfL) framework that efficiently searches for specific objects in a large-scale scene. It begins by dividing the scene into a large number of overlapping local patches. A lightweight deep hash model, referred to as a tiny hashing network (THNet), encodes the local patches into hash codes. The Hamming distances between the hash code of an object image, i.e., an image containing the specific class of objects to be localized in the scene, and those of all local patches are then computed. Small Hamming distances indicate local patches that are similar to the object image, and the positions of these patches in the large-scale scene give the regional locations of the specific objects. The hash codes are binary and take up little space, and computing the Hamming distance incurs very low computational overhead. Further, we use a class center loss as the THNet training objective, which can comprehensively manage multiple object classes. These features mean that the HfL framework can localize specific objects very quickly, regardless of the size of the scene. Extensive experiments validate the effectiveness and efficiency of the framework. For instance, HfL can find objects in a remote-sensing image of 19584 × 19584 pixels in only 4.388 s (on a single RTX 2080 Ti), with remarkable localization results.
The source code and datasets are available at https://github.com/lrhan/HfL, together providing a baseline for fast object localization in a large-scale scene.
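The pipeline described in the abstract (overlapping patches, binary hash codes, Hamming-distance ranking) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the encoder below is a hypothetical random-projection stand-in for the trained THNet, and the function names, patch size, stride, and code length are all assumptions chosen only for illustration.

```python
import numpy as np

def extract_patches(scene, patch, stride):
    """Return overlapping square patches and their top-left (row, col) positions."""
    height, width = scene.shape[:2]
    positions = [(r, c)
                 for r in range(0, height - patch + 1, stride)
                 for c in range(0, width - patch + 1, stride)]
    patches = np.stack([scene[r:r + patch, c:c + patch] for r, c in positions])
    return patches, np.array(positions)

def make_placeholder_encoder(input_dim, n_bits, seed=0):
    """Hypothetical stand-in for THNet: the sign of a fixed random projection.

    A trained deep hash model would replace this function in practice.
    """
    rng = np.random.default_rng(seed)
    projection = rng.standard_normal((input_dim, n_bits))

    def encode(images):
        flat = images.reshape(len(images), -1).astype(np.float64)
        flat = flat - flat.mean(axis=1, keepdims=True)
        bits = (flat @ projection > 0).astype(np.uint8)
        # Pack 8 bits per byte so each code occupies n_bits / 8 bytes.
        return np.packbits(bits, axis=1)

    return encode

def hamming_distances(query_code, codes):
    """Hamming distance between one packed code and every row of packed codes."""
    xor = np.bitwise_xor(codes, query_code)
    return np.unpackbits(xor, axis=1).sum(axis=1)

def localize(scene, object_image, encode, patch, stride, top_k=5):
    """Rank patches by Hamming distance to the object image's hash code."""
    patches, positions = extract_patches(scene, patch, stride)
    codes = encode(patches)
    query = encode(object_image[None])[0]
    dists = hamming_distances(query, codes)
    order = np.argsort(dists, kind="stable")[:top_k]
    return positions[order], dists[order]
```

As a usage example under the same assumptions, calling `localize(scene, object_image, make_placeholder_encoder(patch * patch, 64), patch, stride)` on a single-channel scene returns the top-left corners of the closest patches together with their Hamming distances. Because the codes are bit-packed, a 64-bit code costs 8 bytes per patch, which reflects the abstract's point that binary codes are cheap to store and compare.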

Full Text