Abstract

k-Nearest Neighbor (k-NN) graphs are widely used in data mining and machine learning. Constructing a high-quality k-NN graph efficiently under generic similarity measures is crucial for many applications. In this paper, we propose a new approach to construct an approximate k-NN graph both effectively and efficiently. Our framework is as follows: (1) generate a random k-NN graph approximation, G_f; (2) perform random hierarchical partitions of the space to construct an approximate neighborhood graph, G_p, which is then combined with G_f to yield a more accurate graph, G_m; (3) conduct neighborhood propagation on G_m to further improve accuracy, and take the result as the new G_f; (4) repeat steps (2) and (3) several times until a satisfactory solution is reached. Experiments on a variety of real data sets and a synthetic data set of high intrinsic dimensionality verify the high performance of the proposed method and demonstrate that it is superior to previous state-of-the-art k-NN graph construction approaches.
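The four-step framework above can be sketched in code. The following is a minimal, illustrative Python implementation, not the authors' actual algorithm: it uses Euclidean distance, a random-hyperplane splitting rule for the hierarchical partitions, and brute-force search inside each leaf; all function names and parameters (`n_iters`, `n_trees`, `leaf_size`) are assumptions for this sketch.

```python
import numpy as np

def knn_graph_sketch(X, k, n_iters=3, n_trees=2, leaf_size=16, seed=0):
    """Illustrative sketch of the four-step framework; details are assumptions."""
    rng = np.random.default_rng(seed)
    n = len(X)

    def dist(i, j):
        d = X[i] - X[j]
        return float(d @ d)  # squared Euclidean distance

    # (1) random initial approximation G_f: k random neighbors per point
    graph = {i: {} for i in range(n)}  # node -> {neighbor: distance}
    for i in range(n):
        others = [x for x in range(n) if x != i]
        for j in rng.choice(others, size=min(k, n - 1), replace=False):
            graph[i][int(j)] = dist(i, int(j))

    def add_candidates(i, cands):
        # Merge candidate neighbors into node i's list, keeping the k closest.
        for j in cands:
            if j != i and j not in graph[i]:
                graph[i][j] = dist(i, j)
        if len(graph[i]) > k:
            graph[i] = dict(sorted(graph[i].items(), key=lambda kv: kv[1])[:k])

    def random_partition(idx):
        # (2) random hierarchical partition: split by a random hyperplane;
        # leaves are searched brute-force, yielding G_p merged into G_f -> G_m.
        if len(idx) <= leaf_size:
            for i in idx:
                add_candidates(i, idx)
            return
        a, b = rng.choice(idx, size=2, replace=False)
        direction = X[a] - X[b]
        proj = X[idx] @ direction
        median = np.median(proj)
        left = [i for i, p in zip(idx, proj) if p <= median]
        right = [i for i, p in zip(idx, proj) if p > median]
        if not left or not right:  # degenerate split: fall back to brute force
            for i in idx:
                add_candidates(i, idx)
            return
        random_partition(left)
        random_partition(right)

    for _ in range(n_iters):  # (4) repeat steps (2) and (3)
        for _ in range(n_trees):
            random_partition(list(range(n)))
        # (3) neighborhood propagation: examine neighbors of neighbors
        for i in range(n):
            second = {j2 for j in list(graph[i]) for j2 in graph[j]}
            add_candidates(i, second)

    # return each node's neighbors sorted by distance
    return {i: sorted(graph[i], key=graph[i].get) for i in range(n)}
```

The key design point reflected here is the complementarity of the two sources of candidates: the random partitions supply neighbors that are close in space but not yet connected, while propagation exploits the observation that a neighbor's neighbor is likely a neighbor.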
