Scalable Execution of KNN Queries using Data Parallelism Approach

Kalpana V Metre, M U Kharat

doi:10.14419/ijet.v7i4.19.28286

Open Access

Abstract

In recent years, real-time data-oriented applications such as sensor networks, telecommunications data management, and network monitoring have been required to process a variety of continuous queries on unbounded data streams. Much work has addressed the computational challenges of continuously processing such queries over unbounded, continuous data streams. The K-nearest neighbor (KNN) algorithm is a well-known learning method used in a wide range of problem-solving domains, e.g., network monitoring, data mining, and image processing. Efficient and scalable processing of multiple continuous queries on dynamic data items requires both query indexing and data indexing. Query processing algorithms designed for static databases are not well suited to handling dynamic continuous queries over high-dimensional data sets. Because the set of queries is finite while the data stream is unbounded, it is better to build the index over the queries than over the data. A divide-and-conquer approach is used for indexing and searching for the K nearest neighbors. This approach significantly reduces the space complexity and scales well with increasing data size. The hybrid indexing approach, which combines a grid with a K-dimensional tree, reduces the space cost as well as the search cost. Data parallelism provides scalability for continuous queries over high-volume streams.
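The divide-and-conquer and data-parallel ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `local_knn` and `parallel_knn` are hypothetical, each partition is scanned by brute force rather than through the grid and K-dimensional tree index, and the points are assumed to be a finite batch taken from the stream. The sketch only shows the general pattern of splitting the data, finding the K nearest candidates in each partition, and merging the partial results.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from math import dist


def local_knn(chunk, query, k):
    """Return the k points of one partition closest to the query (brute force)."""
    return heapq.nsmallest(k, chunk, key=lambda p: dist(p, query))


def parallel_knn(points, query, k, workers=4):
    """Divide the points into partitions, search each one, then merge.

    Assumption: `points` is a finite batch (for example, one window of a
    stream) and `workers` is an illustrative parameter, not a value taken
    from the paper.
    """
    # Divide: split the batch into roughly equal partitions.
    size = max(1, -(-len(points) // workers))  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]

    # Conquer: each worker finds the k nearest points within its partition.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: local_knn(c, query, k), chunks))

    # Merge: the global k nearest must be among the local candidates.
    candidates = [p for part in partials for p in part]
    return heapq.nsmallest(k, candidates, key=lambda p: dist(p, query))
```

The merge step is correct because any point among the global K nearest is also among the K nearest of its own partition. Threads are used here only to keep the sketch short; in CPython a process pool or a distributed framework would likely be needed to obtain real speed-up on CPU-bound distance computations.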

Full Text