Abstract

Finding nearest neighbors (NN) is a fundamental operation in diverse domains such as databases, machine learning, data mining, information retrieval, and multimedia retrieval. Because data volumes keep growing and many applications that issue nearest neighbor queries need fast responses, efficient index structures are required to speed up the search. Different application domains have different data characteristics and therefore require different indexing techniques. While the internal indexing and searching mechanism is generally hidden from the top-level application, it is beneficial for a data scientist to understand these fundamental operations and choose an appropriate indexing technique to improve the performance of the end-to-end workflow. Choosing the right search mechanism for a nearest neighbor query can be a daunting task, however. A wrong choice can lead to low accuracy, slow execution, or, in the worst case, both. The objective of this tutorial is to give the audience the knowledge needed to choose the right index structure for a specific application. We present state-of-the-art NN indexing techniques for different data characteristics. We also show the effect, in terms of time and accuracy, of choosing the wrong index structure for different application needs. We conclude the tutorial with a discussion of future challenges in the nearest neighbor search domain.
