Abstract

Efficient k-nearest neighbors (kNN) search is useful in fields such as multimedia information retrieval, data mining and pattern recognition. A distance function determines how similar each database object is to a given kNN query object. Because computing the distance between a pair of objects (i.e., high-dimensional vectors) is computationally expensive, parallel computation is an effective way of reducing running times to acceptable values in large databases. In the present work, we offer novel GPU approaches to solving kNN queries using exhaustive algorithms based on Selection Sort, Quicksort and state-of-the-art algorithms. We show that the best approach depends on the k value of the kNN query, and we achieve a speedup of up to $86.4\times$ over the sequential counterpart. We also propose a multi-core algorithm to be used as a reference for the experiments, and a hybrid algorithm that combines the proposed algorithms with a state-of-the-art heap-based method and obtains the best performance for high k values. Finally, we extend our algorithms to handle large databases that do not fit in GPU memory, without performance deteriorating as database size increases.
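
To illustrate why the best selection strategy can depend on k, the following is a minimal sequential sketch in Python, not the GPU implementation described in this work. It assumes Euclidean distance and small in-memory lists; the function names `knn_selection` and `knn_heap` are hypothetical and chosen only for this example. The first function runs k passes of Selection Sort over the distance array, which costs on the order of n·k comparisons, and the second keeps a bounded max-heap of size k, which costs on the order of n·log k.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_selection(database, query, k):
    """Exhaustive kNN using k passes of Selection Sort (about n*k comparisons)."""
    # Compute every distance first: this is the exhaustive step.
    dists = [(euclidean(v, query), i) for i, v in enumerate(database)]
    k = min(k, len(dists))
    for j in range(k):
        # Find the smallest remaining (distance, index) pair and move it to slot j.
        m = j
        for t in range(j + 1, len(dists)):
            if dists[t] < dists[m]:
                m = t
        dists[j], dists[m] = dists[m], dists[j]
    return dists[:k]


def knn_heap(database, query, k):
    """Exhaustive kNN using a bounded max-heap of size k (about n*log k work)."""
    heap = []  # holds (-distance, -index), so heap[0] is the current worst candidate
    for i, v in enumerate(database):
        item = (-euclidean(v, query), -i)
        if len(heap) < k:
            heapq.heappush(heap, item)
        else:
            # Insert the candidate, then discard the worst of the k+1 entries.
            heapq.heappushpop(heap, item)
    # Undo the negation and return results ordered by increasing distance.
    return sorted((-nd, -ni) for nd, ni in heap)


if __name__ == "__main__":
    db = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [5.0, 5.0], [0.5, 0.5]]
    q = [0.0, 0.0]
    print(knn_selection(db, q, 3))
    print(knn_heap(db, q, 3))
```

Both functions return `(distance, index)` pairs in order of increasing distance and agree on the same input. For small k the selection passes are cheap and simple, while the heap scales better as k grows, which is consistent with the k-dependent behavior reported above.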
