Abstract

With the rapid development of Internet of Things (IoT) technologies, the growing volume and diversity of geospatial big data sources have created challenges in storing, managing, and processing data. In addition to the general characteristics of big data, the unique properties of spatial data make handling geospatial big data even more complicated. To help users implement geospatial big data applications in a MapReduce framework, several big data processing systems have extended the original Hadoop to support spatial properties. Most of these platforms, however, provide spatial functionality in the form of a plug-in. Although a plug-in offers a convenient way to add new features to an existing system, it has several limitations. In particular, executing spatial and nonspatial operations by alternating between the existing system and the plug-in incurs additional read and write overheads in the workflow, significantly reducing performance. To address this issue, we have developed Marmot, a high-performance geospatial big data processing system based on MapReduce. Marmot extends Hadoop at a low level to support seamless integration between spatial and nonspatial operations within a single framework, improving the performance of geoprocessing workflows. This paper explains the overall architecture and data model of Marmot, as well as the main algorithm for automatically constructing MapReduce jobs from a given spatial analysis task. To illustrate how Marmot transforms a sequence of spatial analysis operators into map and reduce functions for better performance, this paper presents an example spatial analysis that retrieves the number of subway stations per city in Korea.
This paper also experimentally demonstrates that Marmot generally outperforms SpatialHadoop, one of the leading plug-in-based spatial big data frameworks, particularly for complex and time-intensive queries involving spatial indexes.
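To make the running example concrete, the sketch below simulates how such an analysis decomposes into a map phase (a spatial join assigning each station to a containing city) and a reduce phase (counting stations per city). This is a minimal illustration only, not Marmot's actual API: the city and station data are toy values, and city polygons are simplified to bounding boxes.

```python
# Hypothetical sketch of the paper's running example: counting subway
# stations per city via map and reduce functions. All names and data
# are illustrative; city shapes are simplified to bounding boxes.
from collections import defaultdict

# (city, (min_x, min_y, max_x, max_y)) -- toy stand-ins for city polygons
CITIES = [
    ("Seoul", (0.0, 0.0, 10.0, 10.0)),
    ("Busan", (20.0, 0.0, 30.0, 10.0)),
]

# (station_id, (x, y)) -- toy stand-ins for subway station points
STATIONS = [("s1", (1.0, 2.0)), ("s2", (5.0, 5.0)), ("s3", (25.0, 3.0))]

def map_phase(stations, cities):
    """Spatial join step: emit (city, 1) for each station inside a city box."""
    for _, (x, y) in stations:
        for city, (x1, y1, x2, y2) in cities:
            if x1 <= x <= x2 and y1 <= y <= y2:
                yield city, 1

def reduce_phase(pairs):
    """Aggregation step: sum the counts emitted for each city key."""
    counts = defaultdict(int)
    for city, one in pairs:
        counts[city] += one
    return dict(counts)

result = reduce_phase(map_phase(STATIONS, CITIES))
print(result)  # {'Seoul': 2, 'Busan': 1}
```

In an actual MapReduce execution, the framework would shuffle the emitted (city, 1) pairs by key between the two phases; a system integrating spatial operations natively can fuse such steps into one job rather than writing intermediate results to disk between a plug-in and the host framework.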

Highlights

  • In the environment of the Internet of Things (IoT), various sensors have been mounted on objects in diverse domains, generating huge volumes of data at high speed [1,2]

  • where SH and M are the execution times required by SpatialHadoop and Marmot, respectively

  • SpatialHadoop was not designed to read Shapefiles directly, even though the Shapefile is a very popular geospatial vector data format in the spatial domain



Introduction

In the environment of the Internet of Things (IoT), various sensors have been mounted on objects in diverse domains, generating huge volumes of data at high speed [1,2]. A significant portion of sensor big data is geospatial data describing objects in relation to geographic information [3,4]. Geospatial big data refers to geographic data sets that cannot be processed using standard computing systems [3,4,5]. The United Nations Initiative on Global Geospatial Information Management (UN-GGIM) reported that 2.5 quintillion bytes of data are created every day, a significant portion of which includes a location component [10].

Results
Discussion
Conclusion

