Abstract

In this paper, a novel framework for spatial data processing is proposed and applied to the National Geographic Conditions Monitoring project of China. It comprises four layers: spatial data storage, spatial RDDs, spatial operations, and a spatial query language. The spatial data storage layer uses HDFS to store large volumes of spatial vector and raster data across the distributed cluster. The spatial RDDs are abstract logical datasets of spatial data types that can be distributed to the Spark cluster for Spark transformations and actions. The spatial operations layer provides a set of operations on spatial RDDs, such as range query, k-nearest-neighbor query, and spatial join. The spatial query language is a user-friendly interface that gives users unfamiliar with Spark a convenient way to invoke the spatial operations. Compared with other spatial frameworks, this one is distinguished by the comprehensive set of technologies it brings together for big spatial data processing. Extensive experiments on real datasets show that the framework achieves better performance than traditional processing methods.
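
Since the abstract does not expose the framework's API, the following Scala sketch only illustrates the layered idea with plain Apache Spark: records persisted on HDFS are loaded into an RDD of a simple point type, and a range query is expressed as an ordinary transformation followed by an action. The Point and Envelope classes, the CSV layout, and the HDFS path are assumptions for illustration, not the framework's actual interfaces.

    import org.apache.spark.sql.SparkSession

    // Illustrative only: Point, Envelope, the CSV layout and the HDFS path are
    // assumptions, not the classes of the proposed framework.
    case class Point(id: Long, x: Double, y: Double)
    case class Envelope(minX: Double, minY: Double, maxX: Double, maxY: Double) {
      def contains(p: Point): Boolean =
        p.x >= minX && p.x <= maxX && p.y >= minY && p.y <= maxY
    }

    object RangeQuerySketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("spatial-range-query").getOrCreate()
        val sc = spark.sparkContext

        // Storage layer: spatial records persisted on HDFS (hypothetical path).
        val points = sc.textFile("hdfs:///data/ngcm/points.csv")
          .map(_.split(","))
          .map(a => Point(a(0).toLong, a(1).toDouble, a(2).toDouble))

        // Spatial operations layer: a range query as an ordinary Spark transformation;
        // the count() action triggers distributed execution on the cluster.
        val window = Envelope(114.0, 30.0, 115.0, 31.0)
        val hits = points.filter(p => window.contains(p)).count()

        println(s"points inside the query window: $hits")
        spark.stop()
      }
    }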

Highlights

  • The storage layer supports persisting spatial data on either the local disk or the Hadoop Distributed File System (HDFS); HDFS is recommended for cluster environments

  • Each block is represented by the minimum bounding rectangle (MBR) of its records, and all the partition blocks are combined into a global R-tree index using their MBRs as index keys through a bulk-loading process (see the sketch after this list)

  • This paper proposes a new Apache Spark-based framework for spatial data processing, which includes four layers: spatial data storage, spatial Resilient Distributed Datasets (RDDs), spatial operations, and a spatial query language
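
As a rough sketch of the block-MBR idea in the second highlight, the snippet below computes one MBR per RDD partition (block) with plain Spark and collects the (block id, MBR) pairs that a bulk-loading routine would pack into a global R-tree. The R-tree construction itself is omitted, and the MBR class and coordinate-pair RDD are illustrative assumptions rather than the paper's data structures.

    import org.apache.spark.rdd.RDD

    // Hypothetical MBR type; the framework's real index classes are not shown in the paper.
    case class MBR(minX: Double, minY: Double, maxX: Double, maxY: Double) {
      def expand(x: Double, y: Double): MBR =
        MBR(math.min(minX, x), math.min(minY, y), math.max(maxX, x), math.max(maxY, y))
    }

    // One MBR per partition (block); the result is small enough to collect on the driver,
    // where a bulk-loading routine (e.g. STR packing) would build the global R-tree.
    def blockMBRs(coords: RDD[(Double, Double)]): Array[(Int, MBR)] =
      coords.mapPartitionsWithIndex { (blockId, it) =>
        if (it.isEmpty) Iterator.empty
        else {
          val (x0, y0) = it.next()
          val mbr = it.foldLeft(MBR(x0, y0, x0, y0)) { case (m, (x, y)) => m.expand(x, y) }
          Iterator((blockId, mbr))
        }
      }.collect()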

Summary

INTRODUCTION

It is implemented on top of Apache Spark and deeply leverages modern database techniques such as efficient data layout, code generation, and query optimization in order to optimize geospatial queries. It supports the full suite of OpenGIS Simple Features for SQL spatial predicate functions and operators, together with additional topological functions. Another software development kit for processing big spatial data with Apache Spark is SparkSpatialSDK (Shangguan, Yue, and Wu, 2017). A novel Apache Spark-based computing framework for spatial data is introduced; it leverages Spark as the underlying layer to achieve better computing performance than Hadoop. What distinguishes it from other Hadoop- and Spark-based spatial computing frameworks is the close integration, both logical and physical, between Hadoop HDFS and Spark spatial RDDs.
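
To make the Simple Features for SQL support concrete, the query below shows the kind of statement such frameworks accept. ST_Contains and ST_Area are standard OGC SF-SQL function names, but the table and column names are hypothetical and the way these functions are registered with a Spark SQL session is framework-specific, so the query is given only as a SQL string.

    // SF-SQL predicate and measure functions in a plain SQL string; the buildings/counties
    // tables and geom columns are hypothetical, and function registration is framework-specific.
    val spatialJoinSql =
      """
        |SELECT c.county_id, COUNT(*) AS building_count
        |FROM buildings b, counties c
        |WHERE ST_Contains(c.geom, b.geom)      -- SF-SQL topological predicate
        |  AND ST_Area(b.geom) > 100.0          -- SF-SQL measure function
        |GROUP BY c.county_id
      """.stripMargin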

DETAILS
Spatial Spark SQL Language
Storage Layer
Build index
Index File Structure
Spatial RDDs Layer
Spatial Operations Layer
EXPERIMENTS
PERFORMANCE COMPARISON
Findings
CONCLUSION