A randomized approximate nearest neighbors algorithm

Peter W Jones, Andrei Osipov, Vladimir Rokhlin

doi:10.1016/j.acha.2012.07.003

Open Access

https://doi.org/10.1016/j.acha.2012.07.003

Journal: Applied and Computational Harmonic Analysis	Publication Date: Jul 20, 2012
Citations: 14	License type: elsevier-specific

Affiliation: Yale University

Abstract

We present a randomized algorithm for the approximate nearest neighbor problem in d-dimensional Euclidean space. Given N points {x_j} in R^d, the algorithm attempts to find k nearest neighbors for each x_j, where k is a user-specified integer parameter. The algorithm is iterative, and its CPU time requirements are proportional to T⋅N⋅(d⋅(log d) + k⋅(d + log k)⋅(log N)) + N⋅k^2⋅(d + log k), with T the number of iterations performed. The memory requirements of the procedure are of the order N⋅(d + k).

A byproduct of the scheme is a data structure permitting a rapid search for the k nearest neighbors among {x_j} for an arbitrary point x ∈ R^d. The cost of each such query is proportional to T⋅(d⋅(log d) + log(N/k)⋅k⋅(d + log k)), and the memory requirements for the requisite data structure are of the order N⋅(d + k) + T⋅(d + N).

The algorithm utilizes random rotations and a basic divide-and-conquer scheme, followed by a local graph search. We analyze the scheme's behavior for normally distributed points {x_j}, and illustrate its performance via several numerical examples.
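The three ingredients named in the abstract (random rotations, divide-and-conquer, local graph search) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it does not attain the stated operation counts: it assumes a dense rotation built by Gram-Schmidt in place of a fast random rotation, median splits along successive coordinates as the divide-and-conquer step, and a single neighbors-of-neighbors pass as the graph search. The names `random_rotation`, `split_into_boxes`, and `approx_knn`, and the leaf size of 2k + 1, are hypothetical choices made for illustration.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def random_rotation(d, rng):
    """Random orthogonal d x d matrix (rows orthonormal), built by
    Gram-Schmidt on Gaussian vectors. Assumption: a dense O(d^2) stand-in
    for the fast rotations the abstract's cost estimate implies."""
    basis = []
    while len(basis) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # discard (vanishingly rare) degenerate draws
            basis.append([vi / norm for vi in v])
    return basis


def split_into_boxes(points, idx, depth, leaf_size):
    """Divide and conquer: split the index set at the median of one
    coordinate per level until each box holds at most leaf_size points."""
    if len(idx) <= leaf_size:
        return [idx]
    axis = depth % len(points[0])
    idx = sorted(idx, key=lambda i: points[i][axis])
    mid = len(idx) // 2
    return (split_into_boxes(points, idx[:mid], depth + 1, leaf_size)
            + split_into_boxes(points, idx[mid:], depth + 1, leaf_size))


def approx_knn(points, k, iterations, seed=0):
    """Return, for each point, k approximate nearest-neighbor indices."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    cand = [set() for _ in range(n)]
    for _ in range(iterations):
        q = random_rotation(d, rng)
        rotated = [[sum(r * x for r, x in zip(row, p)) for row in q]
                   for p in points]
        # Points sharing a box become mutual neighbor candidates.
        for box in split_into_boxes(rotated, list(range(n)), 0, 2 * k + 1):
            for i in box:
                cand[i].update(j for j in box if j != i)
        # Keep only the k best candidates per point between iterations.
        for i in range(n):
            best = sorted(cand[i], key=lambda j: dist2(points[i], points[j]))
            cand[i] = set(best[:k])
    # Local graph search: also examine the candidates of each candidate.
    result = []
    for i in range(n):
        pool = set(cand[i])
        for j in cand[i]:
            pool.update(cand[j])
        pool.discard(i)
        result.append(sorted(pool, key=lambda j: dist2(points[i], points[j]))[:k])
    return result


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
    print(approx_knn(pts, k=4, iterations=5)[0])
```

In this sketch each extra iteration uses a fresh rotation, so a true neighbor that a box boundary cut off in one pass has another chance to share a box with the query point in a later pass. That is the intuition behind the role of the iteration count T, though the accuracy of this simplified version says nothing about the accuracy of the published algorithm.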

Full Text