Abstract

k-Nearest Neighbor (k-NN) graphs are widely used in data mining and machine learning. Constructing a high-quality k-NN graph efficiently under generic similarity measures is crucial for many applications. In this paper, we propose a new approach to construct an approximate k-NN graph both effectively and efficiently. Our framework is as follows: (1) generate a random k-NN graph approximation, G_f; (2) perform random hierarchical partitions of the space to construct an approximate neighborhood graph, G_p, which is then combined with G_f to yield a more accurate graph, G_m; (3) conduct neighborhood propagation on G_m to further improve accuracy, and take the result as the new G_f; (4) repeat steps (2) and (3) several times until a satisfactory solution is reached. Experiments on a variety of real data sets and a synthetic data set of high intrinsic dimensionality verify the high performance of the proposed method and demonstrate that it is superior to previous state-of-the-art k-NN graph construction approaches.
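The four-step framework above can be sketched in code. The following is a minimal, illustrative Python implementation, not the authors' actual algorithm: it uses Euclidean distance, a random-hyperplane splitting rule for the hierarchical partitions, and brute-force search inside each leaf; all function names and parameters (`n_iters`, `n_trees`, `leaf_size`) are assumptions for this sketch.

```python
import numpy as np

def knn_graph_sketch(X, k, n_iters=3, n_trees=2, leaf_size=16, seed=0):
    """Illustrative sketch of the four-step framework; details are assumptions."""
    rng = np.random.default_rng(seed)
    n = len(X)

    def dist(i, j):
        d = X[i] - X[j]
        return float(d @ d)  # squared Euclidean distance

    # (1) random initial approximation G_f: k random neighbors per point
    graph = {i: {} for i in range(n)}  # node -> {neighbor: distance}
    for i in range(n):
        others = [x for x in range(n) if x != i]
        for j in rng.choice(others, size=min(k, n - 1), replace=False):
            graph[i][int(j)] = dist(i, int(j))

    def add_candidates(i, cands):
        # Merge candidate neighbors into node i's list, keeping the k closest.
        for j in cands:
            if j != i and j not in graph[i]:
                graph[i][j] = dist(i, j)
        if len(graph[i]) > k:
            graph[i] = dict(sorted(graph[i].items(), key=lambda kv: kv[1])[:k])

    def random_partition(idx):
        # (2) random hierarchical partition: split by a random hyperplane;
        # leaves are searched brute-force, yielding G_p merged into G_f -> G_m.
        if len(idx) <= leaf_size:
            for i in idx:
                add_candidates(i, idx)
            return
        a, b = rng.choice(idx, size=2, replace=False)
        direction = X[a] - X[b]
        proj = X[idx] @ direction
        median = np.median(proj)
        left = [i for i, p in zip(idx, proj) if p <= median]
        right = [i for i, p in zip(idx, proj) if p > median]
        if not left or not right:  # degenerate split: fall back to brute force
            for i in idx:
                add_candidates(i, idx)
            return
        random_partition(left)
        random_partition(right)

    for _ in range(n_iters):  # (4) repeat steps (2) and (3)
        for _ in range(n_trees):
            random_partition(list(range(n)))
        # (3) neighborhood propagation: examine neighbors of neighbors
        for i in range(n):
            second = {j2 for j in list(graph[i]) for j2 in graph[j]}
            add_candidates(i, second)

    # return each node's neighbors sorted by distance
    return {i: sorted(graph[i], key=graph[i].get) for i in range(n)}
```

The key design point reflected here is the complementarity of the two sources of candidates: the random partitions supply neighbors that are close in space but not yet connected, while propagation exploits the observation that a neighbor's neighbor is likely a neighbor.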
