Abstract
The nearest neighbor search problem in general dimensions finds application in computational geometry, computational statistics, pattern recognition, and machine learning. Although there is a significant body of work on theory and algorithms, surprisingly little work has been done on algorithms for high-end computing platforms, and no open-source library exists that can scale efficiently to thousands of cores. In this paper, we present algorithms and a library built on top of the Message Passing Interface (MPI) and OpenMP that enable nearest neighbor searches to scale to hundreds of thousands of cores for arbitrary-dimensional datasets. The library supports both exact and approximate nearest neighbor searches. The latter is based on iterative, randomized, and greedy KD-tree ($k$-dimensional tree) searches. We describe novel algorithms for the construction of the KD-tree, give a complexity analysis, and provide experimental evidence for the scalability of the method. In our largest runs, we were able to perform an al...