Abstract

A hash algorithm converts data into compact strings. In the multimedia domain, effective hashing is the key to large-scale similarity search in high-dimensional feature spaces. A limitation of existing hashing techniques is that they typically use a single feature. To improve search performance, it is necessary to utilize multiple features. Because hash values must remain compact, concatenating the hash values from different features is not an optimal solution, so a fusion process is desired. In this paper, we solve the multiple feature fusion problem with a hash bit selection framework. Given multiple features, we derive an n-bit hash value that performs better than hash values of the same length computed from each individual feature. The framework uses a feature-independent hash algorithm to generate a sufficient number of bits from each feature, and selects n bits from the resulting hash bit pool by leveraging pair-wise label information. Bits are ranked by a metric called bit reliability, which is estimated by bit-level hypothesis testing. We also take into account the dependence among bits: a weighted graph is constructed for refined bit selection, where bit reliability serves as the vertex weights and the mutual information among hash bits serves as the edge weights. We demonstrate our framework with locality-sensitive hashing (LSH). Extensive experiments confirm that our method is effective and outperforms several state-of-the-art methods.
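
The bit selection pipeline described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: sign-random-projection LSH stands in for the feature-independent hash, and bit reliability is approximated by a simple agreement score (the mean of the fraction of similar pairs on which a bit agrees and the fraction of dissimilar pairs on which it differs) rather than by the hypothesis test itself. The graph-based refinement with mutual information is omitted. The function names `lsh_bits`, `bit_reliability`, and `select_bits` are hypothetical.

```python
import random


def lsh_bits(vectors, n_bits, rng):
    """Sign-random-projection LSH: one bit per random hyperplane."""
    dim = len(vectors[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    return [[int(sum(w * x for w, x in zip(p, v)) > 0) for p in planes]
            for v in vectors]


def bit_reliability(codes, pairs, same):
    """Stand-in reliability score per bit (an assumption, not the paper's test).

    For each bit: 0.5 * (P(bit agrees | similar pair)
                         + P(bit differs | dissimilar pair)).
    Requires at least one similar and one dissimilar pair.
    """
    n_sim = sum(same)
    n_dis = len(same) - n_sim
    scores = []
    for b in range(len(codes[0])):
        agree_sim = sum(codes[i][b] == codes[j][b]
                        for (i, j), s in zip(pairs, same) if s)
        differ_dis = sum(codes[i][b] != codes[j][b]
                         for (i, j), s in zip(pairs, same) if not s)
        scores.append(0.5 * (agree_sim / n_sim + differ_dis / n_dis))
    return scores


def select_bits(features, pairs, same, n, bits_per_feature=32, seed=0):
    """Pool LSH bits from every feature, then keep the n highest-scoring bits."""
    rng = random.Random(seed)
    pool = None
    for vectors in features:
        bits = lsh_bits(vectors, bits_per_feature, rng)
        # Concatenate this feature's bits onto each item's row in the pool.
        pool = bits if pool is None else [a + b for a, b in zip(pool, bits)]
    scores = bit_reliability(pool, pairs, same)
    chosen = sorted(range(len(scores)), key=lambda b: scores[b], reverse=True)[:n]
    codes = [[row[b] for b in chosen] for row in pool]
    return codes, chosen, scores
```

Here `features` is a list of per-feature matrices (one row per item), `pairs` is a list of item index pairs, and `same` marks each pair as similar or dissimilar. The returned `codes` are n-bit hash values drawn from the pooled bits of all features, so the output length stays fixed no matter how many features contribute.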
