Adaptive bit allocation hashing for approximate nearest neighbor search

Qin-Zhen Guo, Zhi Zeng, Shuwu Zhang

doi:10.1016/j.neucom.2014.10.042

Abstract

Using hashing algorithms to learn binary code representations of data for fast approximate nearest neighbor (ANN) search has attracted growing attention. Most existing hashing methods employ a set of hash functions to encode data, and the final binary code is obtained by concatenating the bits those hash functions produce. These methods usually have two main steps: projection and thresholding. One problem with them is that every dimension of the projected data is treated as equally important and encoded with a single bit, which may result in ineffective codes. In this paper, we introduce an adaptive bit allocation hashing (ABAH) method to encode data for ANN search. The basic idea is to encode each dimension with a number of bits chosen according to its dispersion after projection: dimensions with larger dispersion are adaptively allocated more bits, while dimensions with smaller dispersion receive fewer. This novel bit allocation scheme allows our hashing method to effectively preserve the neighborhood structure of the original data space. Extensive experiments show that the proposed ABAH significantly outperforms other state-of-the-art methods on the ANN search task.
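The idea of spending more bits on high-dispersion dimensions can be illustrated with a small sketch. It is not the ABAH algorithm itself, whose allocation rule and quantizer are not specified in the abstract. The sketch assumes that dispersion is measured as per-dimension variance, that bits are handed out greedily under a fixed budget (with each extra bit assumed to quarter the remaining dispersion, a rule borrowed from classical transform coding), and that each dimension is quantized into equal-width cells. The names `allocate_bits` and `encode` are hypothetical.

```python
import statistics


def allocate_bits(projected, total_bits):
    """Greedily split a bit budget across dimensions by dispersion.

    Assumption (not from the paper): dispersion is the population
    variance of each projected dimension, and each bit given to a
    dimension reduces its remaining dispersion by a factor of 4.
    """
    dims = len(projected[0])
    remaining = [
        statistics.pvariance([row[d] for row in projected]) for d in range(dims)
    ]
    bits = [0] * dims
    for _ in range(total_bits):
        # Give the next bit to the dimension with the largest remaining dispersion.
        d = max(range(dims), key=lambda i: remaining[i])
        bits[d] += 1
        remaining[d] /= 4.0
    return bits


def encode(projected, bits):
    """Quantize dimension d into 2**bits[d] equal-width cells and
    concatenate the cell indices as fixed-width binary strings.

    Assumption: equal-width cells over the observed range; dimensions
    that received zero bits are dropped from the code.
    """
    dims = len(bits)
    lo = [min(row[d] for row in projected) for d in range(dims)]
    hi = [max(row[d] for row in projected) for d in range(dims)]
    codes = []
    for row in projected:
        parts = []
        for d in range(dims):
            if bits[d] == 0:
                continue
            levels = 2 ** bits[d]
            span = (hi[d] - lo[d]) or 1.0  # guard against a constant dimension
            cell = min(int((row[d] - lo[d]) / span * levels), levels - 1)
            parts.append(format(cell, "0{}b".format(bits[d])))
        codes.append("".join(parts))
    return codes


if __name__ == "__main__":
    # Toy "projected" data: dimension 0 is widely spread, dimension 2 barely varies.
    projected = [
        [-4.0, 1.0, 0.1],
        [-2.0, -1.0, 0.0],
        [2.0, 1.0, -0.1],
        [4.0, -1.0, 0.0],
    ]
    bits = allocate_bits(projected, total_bits=3)
    print(bits)
    print(encode(projected, bits))
```

With a conventional one-bit-per-dimension scheme, the same 3-bit budget would spend one bit on the nearly constant third dimension. Note that the plain binary cell indices used here do not by themselves make Hamming distance track distance along a dimension; a practical multi-bit scheme would also need a suitable cell-to-code mapping or distance measure.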

Full Text