Abstract

Locality Sensitive Hashing (LSH) uses randomization to approximate nearest neighbor search in high-dimensional spaces. However, applying LSH to large datasets becomes difficult because of its computational complexity. The major aim of this work is therefore to introduce a new LSH algorithm on the Hadoop MapReduce framework to improve the efficiency of random reads over large datasets. The proposed Hash index creates buckets based on hyperplanes, which reduces the amount of data accessed for range queries and thereby improves efficiency. An LSH on MapReduce is developed that decreases random data access time between the map and reduce functions and further improves efficiency. Lastly, to validate the performance of the presented index for search queries in MapReduce, five performance criteria are used: varying cluster size, LSH for bucket-size balancing, the overlapped boundary of a hyperplane, bucket creation based on the configured capacity, and non-indexed, Hash-indexed, and globally indexed datasets on the HDFS configured capacity. The effect of these criteria on the dataset stored on HDFS during the map and reduce functions also demonstrates the superiority of the presented Hash index.
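The abstract refers to buckets created from hyperplanes. As background, the sketch below illustrates the standard random-hyperplane (sign-based) LSH bucketing idea in general terms; it is not the paper's Hash index or its MapReduce implementation, and the function names and parameters (num_planes, dim) are illustrative assumptions.

```python
import numpy as np

def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals; each plane contributes one hash bit."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_planes, dim))

def hash_point(point, planes):
    """Sign pattern of the point against each hyperplane -> bucket key."""
    bits = (planes @ point) >= 0
    return tuple(bits.tolist())

def build_buckets(points, planes):
    """Group points into buckets keyed by their hyperplane sign pattern."""
    buckets = {}
    for idx, p in enumerate(points):
        buckets.setdefault(hash_point(p, planes), []).append(idx)
    return buckets

# Example: bucket 1,000 random 64-dimensional vectors with 8 hyperplanes,
# so a query only needs to scan the points sharing its sign pattern.
points = np.random.default_rng(1).standard_normal((1000, 64))
planes = make_hyperplanes(num_planes=8, dim=64)
buckets = build_buckets(points, planes)
print(f"{len(buckets)} buckets; largest holds {max(map(len, buckets.values()))} points")
```

In a MapReduce setting of the kind described above, such a bucket key could serve as the intermediate key emitted by the map function, so that candidate points for a query are co-located at a single reducer; the details of the paper's approach are not reproduced here.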
