Distributed online similarity search in high dimensional space

Baohui Li, Kefu Xu, Hongtao Xie

doi:10.1109/bigcomp.2014.6741437

Abstract

In this paper, we consider distributed online similarity search for big data in high-dimensional spaces, for which Locality Sensitive Hashing (LSH) has been the method of choice. However, LSH needs a rather large number of hash tables and optimal parameters, so it is difficult for LSH to handle big data on a single machine. To reduce the amount of data each machine must hold, we use a random projection tree (RP-tree) to divide the dataset into well-separated clusters with bounded aspect ratios and place them on different peers in a ring network. To limit the number of network accesses, we place similar subgroups adjacent to each other. We then construct one LSH hash table for each subgroup using optimal parameters. Comprehensive performance evaluations on real-world data show that our approach decreases network cost and brings a major performance improvement while maintaining a good load balance across machines.
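The pipeline described in the abstract (partition with an RP-tree, place subgroups on ring peers, build one LSH table per subgroup) can be illustrated with a minimal single-process sketch. This is not the authors' implementation, and several choices are assumptions made for illustration: the tree splits at the plain median of a random projection, the hash is a random-hyperplane (sign) signature standing in for whatever LSH family the paper uses, and the hypothetical `peer_of` helper simply maps consecutive leaves to consecutive ring positions. All function names and parameters are invented here.

```python
import random
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dist2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def random_direction(dim, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def build_rp_tree(points, depth, rng, leaves):
    """Split points at the median of a random projection (simplified RP-tree).

    Leaves are appended to `leaves` in left-to-right order, so neighbouring
    leaf ids come from neighbouring regions of the tree.
    """
    if depth == 0 or len(points) < 2:
        leaves.append(points)
        return {"leaf_id": len(leaves) - 1}
    direction = random_direction(len(points[0]), rng)
    projections = sorted(dot(p, direction) for p in points)
    threshold = projections[len(projections) // 2]
    left = [p for p in points if dot(p, direction) < threshold]
    right = [p for p in points if dot(p, direction) >= threshold]
    if not left or not right:  # degenerate split, e.g. duplicate points
        leaves.append(points)
        return {"leaf_id": len(leaves) - 1}
    return {
        "direction": direction,
        "threshold": threshold,
        "left": build_rp_tree(left, depth - 1, rng, leaves),
        "right": build_rp_tree(right, depth - 1, rng, leaves),
    }


def route(node, q):
    """Follow the stored projections down to the leaf responsible for q."""
    while "leaf_id" not in node:
        goes_left = dot(q, node["direction"]) < node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["leaf_id"]


def build_tables(leaves, bits, rng):
    """One hash table per subgroup, keyed by a sign-of-projection signature."""
    tables = []
    for pts in leaves:
        planes = [random_direction(len(pts[0]), rng) for _ in range(bits)]
        buckets = defaultdict(list)
        for p in pts:
            buckets[tuple(dot(p, w) >= 0 for w in planes)].append(p)
        tables.append((planes, buckets))
    return tables


def peer_of(leaf_id, num_peers):
    """Hypothetical placement: consecutive leaves sit on consecutive ring peers."""
    return leaf_id % num_peers


def query(tree, tables, q):
    """Route q to one subgroup, then search only the matching bucket there."""
    leaf_id = route(tree, q)
    planes, buckets = tables[leaf_id]
    key = tuple(dot(q, w) >= 0 for w in planes)
    best = min(buckets.get(key, []), key=lambda p: dist2(p, q), default=None)
    return leaf_id, best


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(500)]
    leaves = []
    tree = build_rp_tree(data, 3, rng, leaves)
    tables = build_tables(leaves, 8, rng)
    leaf_id, match = query(tree, tables, data[0])
    print("leaf", leaf_id, "on peer", peer_of(leaf_id, 4), "exact hit:", match == data[0])
```

In this sketch a query touches exactly one subgroup, which is the property the abstract relies on to limit network accesses. A real deployment would presumably also probe neighbouring peers when the query falls close to a split boundary, which this example does not attempt.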

