Abstract

The amount of spatial data generated by sources such as smartphones, space telescopes, and medical devices has grown explosively in recent years, prompting researchers to use existing distributed systems to process it. However, because these systems are not designed for spatial data, they cannot fully exploit its spatial properties to achieve high performance. In this paper, we describe SpatialHadoop, a full-fledged MapReduce framework that extends Hadoop to support spatial data efficiently. SpatialHadoop consists of four main layers: language, indexing, query processing, and visualization. The language layer provides a high-level language with standard spatial data types and operations to make the system accessible to non-technical users. The indexing layer supports standard spatial indexes, such as grid, R-tree, and R+-tree, inside the Hadoop file system in order to speed up spatial operations. The query processing layer encapsulates the spatial operations supported by SpatialHadoop, such as range query, k-nearest-neighbor, spatial join, and computational geometry operations. Finally, the visualization layer allows users to produce images that describe very large datasets, making it easier to explore and understand big spatial data. SpatialHadoop is already used as a main component in several real systems such as MNTG, TAREEG, TAGHREED, and SHAHED.
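To illustrate why a spatial index speeds up an operation such as a range query, the following is a minimal single-machine sketch of a uniform grid index in Python. It is not SpatialHadoop's API or implementation: the `GridIndex` class, its method names, and the `cell_size` parameter are hypothetical names chosen for this example. Each grid cell stands in for one partition, and the query visits only the cells that overlap the query rectangle instead of scanning every point.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


class GridIndex:
    """Hypothetical uniform grid over 2D points; each cell acts as one partition."""

    def __init__(self, cell_size: float) -> None:
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], List[Point]] = defaultdict(list)

    def _cell_of(self, x: float, y: float) -> Tuple[int, int]:
        # Map a coordinate to the integer (column, row) of its grid cell.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, p: Point) -> None:
        self.cells[self._cell_of(*p)].append(p)

    def range_query(self, rect: Rect) -> List[Point]:
        """Return all points inside rect (boundaries inclusive)."""
        x_min, y_min, x_max, y_max = rect
        cx_min, cy_min = self._cell_of(x_min, y_min)
        cx_max, cy_max = self._cell_of(x_max, y_max)
        result: List[Point] = []
        # Pruning step: only cells overlapping the query rectangle are visited.
        for cx in range(cx_min, cx_max + 1):
            for cy in range(cy_min, cy_max + 1):
                for (x, y) in self.cells.get((cx, cy), ()):
                    # Refinement step: a cell may only partly overlap the query.
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        result.append((x, y))
        return result


index = GridIndex(cell_size=10.0)
for point in [(1.0, 1.0), (5.0, 5.0), (15.0, 15.0), (25.0, 3.0), (-3.0, -3.0)]:
    index.insert(point)

matches = index.range_query((0.0, 0.0, 12.0, 12.0))
print(sorted(matches))
```

In a distributed setting the same two steps would plausibly apply at the partition level: skip the partitions whose boundaries do not overlap the query, then filter the records inside the remaining ones.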
