Performance Evaluation of Spatial Data Management Systems Using GeoSpark

Hansub Shin, Hyuk-Yoon Kwon, Kisung Lee

doi:10.1109/bigcomp48618.2020.00-75

Abstract

In this paper, we evaluate the performance of spatial data management systems in distributed computing environments. Because several studies report that GeoSpark outperforms other spatial systems in many scenarios, we choose GeoSpark-based spatial data management systems for this evaluation. Although GeoSpark supports various storage engines as its underlying data store, the effect of the storage engine on spatial data processing has not been well studied. To address this limitation, we evaluate the performance of GeoSpark using two underlying data stores: 1) HDFS and 2) MongoDB. We first design and build distributed experimental environments based on Amazon EC2 and EMR using up to 10 nodes. Through extensive experiments on three synthetic and real data sets, we show that the overall performance of both HDFS- and MongoDB-based GeoSpark improves as we increase the number of nodes. We also show that HDFS-based GeoSpark generally outperforms MongoDB-based GeoSpark, especially for large-scale data sets. In addition, we demonstrate that the proper use of caching on HDFS-based GeoSpark can improve the overall query processing performance by up to three orders of magnitude.

Similar Papers

A comparative experimental study of distributed storage engines for big spatial data processing using GeoSpark
Hansub Shin ... Kisung Lee
The Journal of Supercomputing | VOL. 78
01 Jul 2021

Spatial Data Management and Analysis System for Flood Hazard Mitigation of Poyang Lake Watershed, China
Jiangzhong Lu ... Shuming Bao
Geographic Information Sciences | VOL. 13
01 Dec 2007

An Evaluation of Modern Spatial Libraries
Varun Pandey ... Alfons Kemper
01 Jan 2020

How Good Are Modern Spatial Libraries?
Varun Pandey ... Andreas Kipf
Data Science and Engineering | VOL. 6
07 Nov 2020
