Abstract

SpatialHadoop is an extended MapReduce framework that supports global indexing techniques, which partition spatial datasets across several machines and improve spatial query processing performance compared to traditional Hadoop systems. SpatialHadoop supports several spatial operations (e.g., K Nearest Neighbor search, range query, and spatial intersection join) and seven spatial partitioning techniques (Grid, Quadtree, STR, STR+, k-d tree, Z-curve and Hilbert-curve). Distance-Join Queries (DJQs), such as the K Nearest Neighbors Join Query (KNNJQ) and the K Closest Pairs Query (KCPQ), are common operations used in numerous spatial applications. DJQs are costly operations, since they combine spatial joins with distance-based search. Data partitioning improves the management of large datasets and speeds up query performance. Therefore, performing DJQs efficiently with new partitioning methods in SpatialHadoop is a challenging task. In this paper, a new data partitioning technique based on Voronoi diagrams is designed and implemented in SpatialHadoop. Moreover, improved KNNJQ and KCPQ MapReduce algorithms that use the new partitioning mechanism are designed and developed for SpatialHadoop. Finally, the results of an extensive set of experiments with real-world datasets are presented, demonstrating that the new partitioning technique and the improved DJQ MapReduce algorithms are efficient, scalable and robust in SpatialHadoop.
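
To make the terminology above concrete, the following is a minimal single-machine sketch in Python. It is not the paper's MapReduce implementation and does not use any SpatialHadoop API. It assumes 2D points given as tuples, and the function names (`voronoi_partition`, `k_closest_pairs`, `knn_join`) and the choice of pivots are hypothetical, introduced only for illustration. The partitioning step assigns each point to its nearest pivot, which is the defining property of a Voronoi cell; the two queries are brute-force definitions of KCPQ and KNNJQ.

```python
import heapq
import math
from collections import defaultdict


def voronoi_partition(points, pivots):
    """Assign each point to the cell of its nearest pivot (its Voronoi cell).

    Returns a dict mapping pivot index -> list of points in that cell.
    The pivots are an assumption here; how they are chosen is not specified.
    """
    cells = defaultdict(list)
    for p in points:
        nearest = min(range(len(pivots)), key=lambda i: math.dist(p, pivots[i]))
        cells[nearest].append(p)
    return dict(cells)


def k_closest_pairs(P, Q, k):
    """Brute-force KCPQ: the k pairs (p, q), p in P and q in Q, with the
    smallest Euclidean distance, returned as (distance, p, q) tuples."""
    pairs = ((math.dist(p, q), p, q) for p in P for q in Q)
    return heapq.nsmallest(k, pairs)


def knn_join(P, Q, k):
    """Brute-force KNNJQ: for every p in P, its k nearest neighbours in Q."""
    return {p: heapq.nsmallest(k, Q, key=lambda q: math.dist(p, q)) for p in P}


if __name__ == "__main__":
    P = [(0, 0), (5, 5)]
    Q = [(1, 0), (5, 6), (10, 10)]
    print(voronoi_partition(P + Q, pivots=[(0, 0), (10, 10)]))
    print(k_closest_pairs(P, Q, k=2))
    print(knn_join(P, Q, k=1))
```

Both brute-force queries compare every point of P with every point of Q, which illustrates why DJQs are costly and why restricting comparisons to nearby partitions matters at scale.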
