Abstract

The rapid growth of large-scale spatial terrain data, driven by the proliferation of data acquisition devices such as 3D laser scanners, poses challenges for spatial data analysis and computation. Among the many spatial analyses and computations, polygon retrieval is a fundamental operation that is often performed under real-time constraints. However, existing sequential algorithms fail to meet this demand as terrain datasets grow larger. Motivated by the MapReduce programming model, a widely adopted technique for large-scale parallel data processing, we present a MapReduce-based polygon retrieval algorithm designed to reduce the I/O and CPU loads of spatial data processing. Indexing the data with a quad-tree allows a significant amount of unneeded data to be discarded in the filtering stage, which reduces the I/O overhead. The indexed data also makes it faster to determine the relationship between the terrain data and the query area. Experiments on our Hadoop cluster show that our algorithm performs significantly better than existing distributed algorithms.

Highlights

  • Cloud computing is continually being improved for computational geometry, such as the operations commonly used in GIS

  • We present a distributed polygon retrieval algorithm based on MapReduce

  • We evaluate the effectiveness of our polygon retrieval algorithm by varying the size of the Hadoop cluster, measured in number of VMs (for example, 5, 10, and 20)

Summary

INTRODUCTION

Cloud computing is continually being improved for computational geometry, such as the operations commonly used in GIS. The challenges of polygon retrieval over a large terrain dataset include how to organize, partition, and distribute very large spatial datasets across tens or hundreds of nodes in a cloud datacenter so that applications can query and analyze the data quickly and cost-effectively. To address these challenges, we first index the data with a quad-tree, which is simpler than the R-tree index (Eldawy and Mokbel, 2013).
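
To make the filter-then-refine idea concrete, the following is a minimal single-machine sketch in Python, not the paper's MapReduce implementation. It assumes terrain data is a set of 2D points that all lie inside the root bounding box and that the query area is a simple polygon. All names (`QuadTree`, `candidates`, `retrieve`, `capacity`, `max_depth`) are hypothetical and chosen for illustration. The quad-tree discards cells that do not intersect the polygon's bounding box (filtering), and only the surviving points are tested against the polygon itself (refinement).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def boxes_intersect(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def point_in_polygon(p: Point, poly: List[Point]) -> bool:
    """Ray-casting test; points exactly on an edge may go either way."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class QuadTree:
    box: Box
    capacity: int = 4      # assumed leaf capacity before a split
    depth: int = 0
    max_depth: int = 16    # guards against endless splits on duplicate points
    points: List[Point] = field(default_factory=list)
    children: List["QuadTree"] = field(default_factory=list)

    def _child_for(self, p: Point) -> "QuadTree":
        xmin, ymin, xmax, ymax = self.box
        xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
        # Children are stored in the order SW, SE, NW, NE.
        return self.children[(1 if p[0] >= xm else 0) + (2 if p[1] >= ym else 0)]

    def _split(self) -> None:
        xmin, ymin, xmax, ymax = self.box
        xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
        quadrants = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
                     (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
        self.children = [
            QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
            for q in quadrants
        ]
        pending, self.points = self.points, []
        for p in pending:
            self._child_for(p).insert(p)

    def insert(self, p: Point) -> None:
        if self.children:
            self._child_for(p).insert(p)
            return
        self.points.append(p)
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def candidates(self, query_box: Box) -> List[Point]:
        """Filtering stage: skip every cell that misses the query box."""
        if not boxes_intersect(self.box, query_box):
            return []
        if not self.children:
            return list(self.points)
        found: List[Point] = []
        for child in self.children:
            found.extend(child.candidates(query_box))
        return found


def retrieve(tree: QuadTree, polygon: List[Point]) -> List[Point]:
    """Filter by the polygon's bounding box, then refine with an exact test."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    bbox = (min(xs), min(ys), max(xs), max(ys))
    return [p for p in tree.candidates(bbox) if point_in_polygon(p, polygon)]
```

In a distributed setting, one plausible arrangement (again an assumption, not a description of the authors' design) is for each map task to run the filtering step over its own partition of the index and emit only candidate points, so that data outside the query area is never read or shuffled.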

RELATED WORKS
MAPREDUCE-BASED POLYGON RETRIEVAL ALGORITHM
EXPERIMENT
Dataset and Experiment Environment
Algorithm Efficiency
Scalability
Findings
CONCLUSION