A novel algorithm for scalable k-nearest neighbour graph construction

Youngki Park, Heasoo Hwang, Sang-Goo Lee

doi:10.1177/0165551515594728

Abstract

Finding the k-nearest neighbours of every node in a dataset is one of the most important data operations, with wide application in areas such as recommendation and information retrieval. However, a major challenge is that the execution time of existing approaches grows rapidly as the number of nodes or dimensions increases. In this paper, we present greedy filtering, an efficient and scalable algorithm for finding an approximate k-nearest neighbour graph. It selects a fixed number of nodes as candidates for every node by filtering out node pairs that do not have any matching dimensions with large values. Greedy filtering achieves consistent approximation accuracy across nodes in linear execution time. We also present a faster version of greedy filtering that uses inverted indices on the node prefixes. Through theoretical analysis, we show that greedy filtering is effective for datasets whose features follow a Zipfian distribution, a characteristic observed in the majority of large datasets. We also conduct extensive comparative experiments against (a) three state-of-the-art algorithms and (b) three algorithms in related research domains. Our experimental results show that greedy filtering consistently outperforms the other algorithms on various types of high-dimensional datasets.
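The candidate-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sparse vectors stored as dictionaries, cosine similarity as the measure, and a fixed prefix length per node (the hypothetical parameter `prefix_size`), whereas the abstract only states that a fixed number of candidates is selected per node. The function names `cosine` and `approximate_knn_graph` are likewise illustrative.

```python
import heapq
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: value}."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(val * v[dim] for dim, val in u.items() if dim in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def approximate_knn_graph(vectors, k, prefix_size):
    """Approximate k-NN graph via prefix filtering (illustrative sketch).

    vectors     : {node: {dimension: value}}
    k           : neighbours kept per node
    prefix_size : assumed fixed number of largest-valued dimensions
                  used as each node's prefix (a simplification)
    """
    # Prefix of a node: its dimensions with the largest values.
    prefixes = {
        node: sorted(vec, key=lambda d: -vec[d])[:prefix_size]
        for node, vec in vectors.items()
    }

    # Inverted index: dimension -> nodes whose prefix contains it.
    index = defaultdict(set)
    for node, dims in prefixes.items():
        for dim in dims:
            index[dim].add(node)

    graph = {}
    for node, dims in prefixes.items():
        # Only nodes sharing at least one prefix dimension are compared;
        # every other pair is filtered out without a similarity computation.
        candidates = set()
        for dim in dims:
            candidates |= index[dim]
        candidates.discard(node)
        graph[node] = heapq.nlargest(
            k, candidates, key=lambda c: cosine(vectors[node], vectors[c])
        )
    return graph
```

With a small prefix, each node is compared only against nodes that share one of its dominant dimensions, which is where the saving over an all-pairs comparison comes from. Nodes whose prefixes match nothing end up with fewer than k neighbours in this simplified variant.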

Journal: Journal of Information Science	Publication Date: Jul 22, 2015
Citations: 7

Similar Papers

Scalable k-nearest neighbor graph construction based on greedy filtering
Youngki Park ... Sang-Goo Lee
13 May 2013

Fast Collaborative Filtering with a k-nearest neighbor graph
Youngki Park ... Woosung Jung
01 Jan 2014

Greedy Filtering: A Scalable Algorithm for K-Nearest Neighbor Graph Construction
Youngki Park ... Sang-Goo Lee
01 Jan 2014

Hybrid OpenMP/MPI programs for solving the time-dependent Gross–Pitaevskii equation in a fully anisotropic trap
Bogdan Satarić ... Sadhan K Adhikari
Computer Physics Communications | VOL. 200
22 Dec 2015
