Abstract

Spatial data has recently become one of the most active areas of big data research, as it is generated and consumed by many different sources. The growing number of location-based services and applications, such as Google Maps, vehicle navigation, and recommendation systems, is the main driver of this interest. Accordingly, several researchers have started to explore and compare spatial frameworks to understand the requirements of systems for spatial data processing, manipulation, and analysis. Apache Spark, Apache Ignite, and Hadoop are the most widely known frameworks for large-scale data processing. Apache Spark and Apache Ignite have both integrated spatial data operations and analysis queries, but each system has its advantages and disadvantages when dealing with spatial data. Adopting a new framework or system that must integrate new functionality can be a risky decision if it has not been examined well. The main aim of this research is to conduct a comprehensive evaluation of big spatial data computing on two well-known data management systems, Apache Ignite and Apache Spark. The comparison covers four aspects: experimental environment setup, supported features, supported functions and queries, and performance and execution time. The results show that GeoSpark is more flexible to use than SpatialIgnite. Our investigation found that multiple factors affect the performance of both frameworks, such as CPU, main memory, data set size, the complexity of the data type, and the programming environment. Spark is more advanced and is equipped with several functionalities that make it well suited to spatial data queries and indexing, such as kNN queries, which are not supported in SpatialIgnite.

Highlights

  • Big data processing has always been a critical research area in both academia and industry

  • 4) Performance and Execution Time: According to the results and assumptions of Md Mahbub Alam [12], SpatialIgnite delivers the best performance when compared with SpatialHadoop and GeoSpark. From the observations of this research, GeoSpark has been widely used for big spatial data processing, and many spatial frameworks depend on it, such as Apache Sedona, which builds on GeoSpark to enable spatial big data computing

  • In terms of experimental environment setup, enabling and working with spatial data computing was more flexible on Apache Spark than on Apache Ignite. Preparing the experimental requirements for Apache Ignite took more time than for Apache Spark, because spatial data computing on Apache Ignite required working in an IDE such as NetBeans and adding dependencies, such as the JTS Topology Suite, in the IDE. Apache Spark, in contrast, does not depend on any IDE: the dependencies were added in spark-shell with simple code, after which work could start in the shell using the Scala programming language

Summary

Introduction

Big data processing has always been a critical research area in both academia and industry. Several big tech organizations, for example Facebook [1], LinkedIn [2], Microsoft [3], and ESRI [4], to name a few, have invested billions of dollars to build big data ecosystems. Several non-tech companies have integrated one or more of the available platforms to scale out and perform their big data analytics tasks. One important domain of this market is building ecosystems for spatial data, due to the plethora of applications and services that generate such data. Over the last few years, Earth observation has continuously provided a significant volume of geospatial data for uses such as resource tracking [5], environmental protection, and disaster prediction [6]. Big spatial data computing has become extremely valuable with the widespread use of these services and applications.
