Abstract

Learning binary code representations of data with hashing algorithms for fast approximate nearest neighbor (ANN) search has attracted growing attention. Most existing hashing methods encode data with a set of hash functions and obtain the final binary code by concatenating the bits those functions produce. These methods usually have two main steps: projection and thresholding. One problem with them is that every dimension of the projected data is treated as equally important and encoded with a single bit, which may result in ineffective codes. In this paper, we introduce an adaptive bit allocation hashing (ABAH) method to encode data for ANN search. The basic idea is to encode each projected dimension with a number of bits that depends on its dispersion: dimensions with larger dispersion are adaptively allocated more bits, and dimensions with smaller dispersion are allocated fewer. This bit allocation scheme allows our hashing method to effectively preserve the neighborhood structure of the original data space. Extensive experiments show that the proposed ABAH significantly outperforms other state-of-the-art methods on the ANN search task.
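The idea of dispersion-driven bit allocation can be illustrated with a minimal sketch. This is not the ABAH algorithm itself, whose details the abstract does not give. It assumes a simple greedy rule borrowed from transform coding: each bit goes to the dimension with the largest remaining dispersion, and each assigned bit is assumed to cut that dimension's remaining dispersion by a factor of four. The names `allocate_bits`, `encode_value`, and `encode_point` are hypothetical, as are the uniform quantization cells and the plain binary cell index.

```python
from typing import List, Sequence, Tuple


def allocate_bits(dispersions: Sequence[float], total_bits: int) -> List[int]:
    """Greedily hand out bits, one at a time, to the dimension whose
    remaining dispersion is currently the largest."""
    residual = list(dispersions)
    bits = [0] * len(residual)
    for _ in range(total_bits):
        k = max(range(len(residual)), key=residual.__getitem__)
        bits[k] += 1
        # Assumption: one extra bit quarters the remaining dispersion.
        residual[k] /= 4.0
    return bits


def encode_value(x: float, lo: float, hi: float, n_bits: int) -> str:
    """Quantize x uniformly over [lo, hi] into 2**n_bits cells and
    return the cell index as a fixed-width bit string."""
    if n_bits == 0:
        return ""  # dimensions that received no bits contribute nothing
    cells = 2 ** n_bits
    idx = int((x - lo) / (hi - lo) * cells)
    idx = min(max(idx, 0), cells - 1)  # clamp values on or outside the range
    return format(idx, "0{}b".format(n_bits))


def encode_point(
    point: Sequence[float],
    ranges: Sequence[Tuple[float, float]],
    bits: Sequence[int],
) -> str:
    """Concatenate the per-dimension codes of one projected point."""
    return "".join(
        encode_value(x, lo, hi, b) for x, (lo, hi), b in zip(point, ranges, bits)
    )


if __name__ == "__main__":
    bits = allocate_bits([9.0, 2.0, 0.5], total_bits=4)
    print(bits)
    print(encode_point([0.6, -1.0, 0.0], [(0.0, 1.0), (-2.0, 2.0), (-1.0, 1.0)], bits))
```

With a budget of four bits, the high-dispersion first dimension receives most of them and the low-dispersion last dimension receives none, in contrast to the one-bit-per-dimension scheme criticized above. The plain binary cell index is used only for readability: neighboring cells can differ in more than one bit, so the choice of code and distance measure for multi-bit dimensions is a separate design decision that this sketch does not address.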
