Abstract

We present a randomized algorithm for the approximate nearest neighbor problem in $d$-dimensional Euclidean space. Given $N$ points $\{x_j\}$ in $\mathbb{R}^d$, the algorithm attempts to find the $k$ nearest neighbors of each $x_j$, where $k$ is a user-specified integer parameter. The algorithm is iterative, and its CPU time requirements are proportional to $T \cdot N \cdot (d \cdot \log d + k \cdot (d + \log k) \cdot \log N) + N \cdot k^2 \cdot (d + \log k)$, where $T$ is the number of iterations performed. The memory requirements of the procedure are of the order $N \cdot (d + k)$.

A byproduct of the scheme is a data structure that permits a rapid search for the $k$ nearest neighbors among $\{x_j\}$ of an arbitrary point $x \in \mathbb{R}^d$. The cost of each such query is proportional to $T \cdot (d \cdot \log d + \log(N/k) \cdot k \cdot (d + \log k))$, and the memory requirements for the requisite data structure are of the order $N \cdot (d + k) + T \cdot (d + N)$.

The algorithm utilizes random rotations and a basic divide-and-conquer scheme, followed by a local graph search. We analyze the scheme's behavior for normally distributed points $\{x_j\}$, and illustrate its performance via several numerical examples.
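The three ingredients named above (random rotations, divide-and-conquer, local graph search) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (`random_rotation`, `split_into_boxes`, `approx_knn`) is hypothetical. It also makes several simplifying assumptions that the abstract does not confirm: the rotation is a dense orthogonal matrix built by Gram-Schmidt, whereas the $d \cdot \log d$ term in the stated cost suggests a faster rotation; the splitting stops at boxes of at most $2k+1$ points; and the graph search looks only at neighbors of neighbors, once.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def random_rotation(d, rng):
    """Random orthogonal d x d matrix (rows) via Gram-Schmidt on Gaussian vectors.

    Assumption: a dense O(d^2)-per-point rotation stands in for a fast one.
    """
    basis = []
    while len(basis) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # skip the (unlikely) degenerate draw
            basis.append([vi / norm for vi in v])
    return basis


def rotate(q, x):
    """Apply the orthogonal matrix q (list of rows) to the point x."""
    return [sum(qi * xi for qi, xi in zip(row, x)) for row in q]


def split_into_boxes(indices, rotated, k, axis=0):
    """Divide-and-conquer: halve at the median, cycling through coordinates,
    until each box holds at most 2k+1 points (an assumed stopping rule)."""
    if len(indices) <= 2 * k + 1:
        return [indices]
    ordered = sorted(indices, key=lambda i: rotated[i][axis])
    mid = len(ordered) // 2
    nxt = (axis + 1) % len(rotated[0])
    return (split_into_boxes(ordered[:mid], rotated, k, nxt)
            + split_into_boxes(ordered[mid:], rotated, k, nxt))


def merge(points, i, current, candidates, k):
    """Keep the k closest of the current and candidate neighbors of point i."""
    pool = set(current) | set(candidates)
    pool.discard(i)
    return sorted(pool, key=lambda j: sq_dist(points[i], points[j]))[:k]


def approx_knn(points, k, iterations=5, seed=0):
    """Approximate k nearest neighbors (as index lists) for every point."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    neighbors = [[] for _ in range(n)]
    for _ in range(iterations):
        q = random_rotation(d, rng)
        rotated = [rotate(q, p) for p in points]
        # Within each box, every point considers all box members as candidates.
        for box in split_into_boxes(list(range(n)), rotated, k):
            for i in box:
                neighbors[i] = merge(points, i, neighbors[i], box, k)
    # Local graph search: also consider the neighbors of each current neighbor.
    snapshot = [list(nb) for nb in neighbors]
    for i in range(n):
        second = [j2 for j in snapshot[i] for j2 in snapshot[j]]
        neighbors[i] = merge(points, i, snapshot[i], second, k)
    return neighbors


if __name__ == "__main__":
    demo_rng = random.Random(42)
    # Normally distributed points, the case analyzed in the abstract.
    pts = [[demo_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
    result = approx_knn(pts, k=4, iterations=6)
    print(result[0])  # indices of the approximate neighbors of point 0
```

Each iteration uses a fresh rotation, so points separated by a box boundary in one iteration have a chance of sharing a box in another; the `iterations` argument plays the role of $T$. Distances are always measured in the original coordinates, which is valid because a rotation preserves Euclidean distance.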
