A Learning to Tune Framework for LSH

Xiu Tang, Wei Cao, Zhifei Pang, Sai Wu, Jinyang Gao, Gang Chen

doi:10.1109/icde51399.2021.00224

Abstract

Nearest neighbor (NN) search in high-dimensional spaces is inherently computationally expensive due to the curse of dimensionality. As a well-known solution to approximate NN search, locality-sensitive hashing (LSH) can answer c-approximate NN (c-ANN) queries in sublinear time with a well-defined performance bound. The success of the LSH family depends mainly on the design of randomly projected hash functions. However, instead of randomly drawing hash functions from a conventional hashing family, such as Gaussian projection for Euclidean space, we ask whether there could be a set of data-sensitive hash functions with a higher capacity to distinguish nearby points from faraway points, while retaining a rigorous performance guarantee like conventional LSH. To this end, we propose a learning to tune framework, called LSH-tuning, which consists of a pruning model and a learning to rank model. The pruning model reduces the total number of hash tables to maximize the separating capacity on the given data distribution and minimize the storage overhead. The learning to rank model ranks hash tables based on their effectiveness for NN retrieval. We also provide a theoretical model that guides us to gradually search more hash tables and probe nearby buckets. Extensive experiments with real-world data demonstrate that LSH-tuning outperforms existing proposals with respect to both efficiency and storage overhead.
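
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of conventional LSH with Gaussian projection for Euclidean space, where each hash has the form floor((a·x + b) / w). It is not the LSH-tuning method itself. The class and function names (`GaussianLSHTable`, `build_index`, `query`) and the parameters `k`, `w`, and `num_tables` are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import defaultdict


class GaussianLSHTable:
    """One hash table keyed by k concatenated Gaussian-projection hashes.

    Illustrative baseline only; names and defaults are assumptions.
    """

    def __init__(self, dim, k=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.w = w  # bucket width
        # k random directions with N(0, 1) entries, plus offsets drawn from [0, w]
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
        self.b = [rng.uniform(0.0, w) for _ in range(k)]
        self.buckets = defaultdict(list)

    def key(self, x):
        # h_i(x) = floor((a_i . x + b_i) / w); the bucket key is the k-tuple
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def insert(self, idx, x):
        self.buckets[self.key(x)].append(idx)

    def candidates(self, q):
        return self.buckets.get(self.key(q), [])


def build_index(points, num_tables=8, **kwargs):
    """Build several independently seeded tables over the same points."""
    dim = len(points[0])
    tables = [GaussianLSHTable(dim, seed=s, **kwargs) for s in range(num_tables)]
    for table in tables:
        for i, p in enumerate(points):
            table.insert(i, p)
    return tables


def query(tables, points, q):
    """Return the index of the closest candidate sharing a bucket with q, or None."""
    cand = set()
    for table in tables:
        cand.update(table.candidates(q))
    if not cand:
        return None
    return min(cand, key=lambda i: math.dist(points[i], q))
```

In this sketch every table is drawn at random and probed with equal priority. According to the abstract, this is where LSH-tuning differs: it prunes the set of hash tables and ranks the remaining ones by their effectiveness for NN retrieval.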

Similar Papers

HashFile: An efficient index structure for multimedia data
Dongxiang Zhang ... Anthony K H Tung
01 Apr 2011

Efficient and accurate nearest neighbor and closest pair search in high-dimensional space
Yufei Tao ... Panos Kalnis
ACM Transactions on Database Systems | VOL. 35
01 Jul 2010

Preserving-Ignoring Transformation Based Index for Approximate k Nearest Neighbor Search
Gang Hu ... Dongxiang Zhang
01 Apr 2017

PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search
Bolong Zheng ... Xi Zhao
The VLDB Journal | VOL. 31
03 Jul 2021
