Abstract

Owing to the fast growth of applications such as social media analysis, web data analysis, and medical information network analysis, many types of data are now processed frequently. Managing and analyzing this large amount of data effectively is a vital goal. To reduce data processing complexity, time complexity, and space complexity in Big Data, this paper proposes using the k-nearest neighbor (KNN) join operation. For each query point, a KNN join finds the k nearest points in a data set S. It is a computational task that serves a wide range of applications, such as knowledge discovery and data mining. As the volume and the dimension of the data increase, only distributed approaches can perform such large operations in a given time. Recent works have implemented efficient solutions using the MapReduce programming model, because it is designed for distributed large-scale data processing. Although these works provide different solutions to the same problem, each one has particular constraints and properties. This paper compares the existing approaches to computing KNN on MapReduce. First, the paper breaks the solutions down into three steps for KNN computation on MapReduce: 1) data processing, 2) data partitioning, and 3) computation. The experiments in this paper use a variety of data sets and analyze the effect of data volume, data dimension, and the value of k from several perspectives, such as time complexity, space complexity, and accuracy.
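The partition, compute, and merge structure described above can be illustrated with a minimal single-process sketch. It is not any of the compared methods: it assumes a simple round-robin split of S, Euclidean distance, and an exact merge of per-block candidates, and all function names (`partition`, `map_local_knn`, `reduce_global_knn`, `knn_join`) are hypothetical. A real deployment would run the map and reduce phases on a MapReduce framework rather than in one Python process.

```python
import heapq
import math
from collections import defaultdict


def partition(points, num_parts):
    """Partitioning step (assumed scheme): round-robin split of S into blocks."""
    return [points[i::num_parts] for i in range(num_parts)]


def map_local_knn(queries, block, k):
    """Map phase: for each query point, emit its k nearest candidates in one block."""
    for qid, q in enumerate(queries):
        # Each candidate is a (distance, point) pair so it can be ranked later.
        local = heapq.nsmallest(k, ((math.dist(q, s), s) for s in block))
        yield qid, local


def reduce_global_knn(candidates, k):
    """Reduce phase: merge candidates from all blocks and keep the k closest."""
    return heapq.nsmallest(k, candidates)


def knn_join(queries, S, k, num_parts=2):
    """Simulate the map and reduce phases sequentially in a single process."""
    grouped = defaultdict(list)  # stands in for the shuffle: query id -> candidates
    for block in partition(S, num_parts):
        for qid, local in map_local_knn(queries, block, k):
            grouped[qid].extend(local)
    return {
        qid: [point for _, point in reduce_global_knn(cands, k)]
        for qid, cands in grouped.items()
    }
```

Because every block returns its own k best candidates, the merged result is exact: any true global neighbor is necessarily among the k nearest of the block that holds it. The cost is that each query point is compared against every block, which is the kind of overhead that smarter partitioning schemes aim to reduce.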
