Abstract
Locality-sensitive hashing (LSH) has emerged as the dominant algorithmic technique for similarity search with strong performance guarantees in high-dimensional spaces. A drawback of traditional LSH schemes is that they may have false negatives, i.e., the recall is less than 100%. This limits the applicability of LSH in settings requiring precise performance guarantees. Building on the recent theoretical construction that eliminates false negatives, we propose a fast and practical covering LSH scheme for Hamming space called Fast CoveringLSH (fcLSH). Inheriting the design benefits of CoveringLSH our method avoids false negatives and always reports all near neighbors. Compared to CoveringLSH we achieve an asymptotic improvement to the hash function computation time from O(dL) to O(d + (LlogL), where d is the dimensionality of data and L is the number of hash tables. Our experiments on synthetic and real-world data sets demonstrate that fcLSH is comparable (and often superior) to traditional hashing-based approaches for search radius up to 20 in high-dimensional Hamming space.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.