Abstract

Locality-sensitive hashing (LSH) has gained ever-increasing popularity in similarity search for large-scale data. It has competitive search performance when the number of generated hash bits is large, reversely bringing adverse dilemmas for its wide applications. The first purpose of this work is to introduce a novel hash bit reduction schema for hashing techniques to derive shorter binary codes, which has not yet received sufficient concerns. To briefly show how the reduction schema works, the second purpose is to present an effective bit reduction method for LSH under the reduction schema. Specifically, after the hash bits are generated by LSH, they will be put into bit pool as candidates. Then mutual information and data labels are exploited to measure the correlation and structural properties between the hash bits, respectively. Eventually, highly correlated and redundant hash bits can be distinguished and then removed accordingly, without deteriorating the performance greatly. The advantages of our reduction method include that it can not only reduce the number of hash bits effectively but also boost retrieval performance of LSH, making it more appealing and practical in real-world applications. Comprehensive experiments were conducted on three public real-world datasets. The experimental results with representative bit selection methods and the state-of-the-art hashing algorithms demonstrate that the proposed method has encouraging and competitive performance.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.