Abstract

The well-known k-nearest neighbors problem (kNN) involves building a data structure that reports the k closest training points to each of a given set of query points, with all points being in a given metric space S. The problem discussed here is an important operation in rotation-invariant image processing. It consists of a nontrivial variant of kNN: given a set of training points X and a set of query points Y find, for each query point y∈Y, the k nearest training points to y, where the notion of distance is given by a pseudometric of S defined over cyclic permutations of y and the elements of X. The multiplicity of the query point permutations makes serial brute force search too costly for instances of practical size. We present a transformation that enables any instance of the variant to be solved as a kNN problem. Although this enables the application of any kNN algorithm, the transformation is too time costly to be practical for instances of practical dimensions. For this reason, we present a condensation algorithm for the efficient elimination of unfavorable training points (that cannot be among the k closest neighbors of y) and an effective parallel programming approach based on the discrete Fourier transform. The significant speedup gained over brute force search on practical datasets is reported. We also provide the mathematical basis of a conjecture that, if true, would enable the speedup to be significantly improved. The application of the approach to classification by support vector machines and k-means clustering is also discussed.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.