Abstract
With the rapid development of mobile data acquisition technology, the volume of available spatial data is growing at an increasingly fast pace. The real-time processing of big spatial data has become a research frontier in the field of Geographic Information Systems (GIS). To cope with these highly dynamic data, we aim to reduce the time complexity of data updating by modifying the traditional spatial index. However, existing algorithms and data structures are based on single work nodes, which are incapable of handling the required high numbers and update rates of moving objects. In this paper, we present a distributed spatial index based on Apache Storm, an open-source distributed real-time computation system. Using this approach, we compare the range and K-nearest neighbor (KNN) query efficiency of four spatial indexes on a single dataset and introduce a method of performing spatial joins between two moving datasets. In particular, we build a secondary distributed index for spatial join queries based on the grid-partition index. Finally, a series of experiments are presented to explore the factors that affect the performance of the distributed index and to demonstrate the feasibility of the proposed distributed index based on Storm. As a real-world application, this approach has been integrated into an information system that provides real-time traffic decision support.
Highlights
Advanced technologies for sensing and computing have resulted in the creation of massive datasets consisting of trajectories of people and vehicles
We present a distributed spatial index based on Apache Storm, an open-source distributed real-time computation system
You et al. proposed an in-memory distributed computing framework for spatial join operations, because Hadoop must store its intermediate results on hard disk and the resulting disk I/O cost reduces analytical efficiency [3]
Summary
Advanced technologies for sensing and computing have resulted in the creation of massive datasets consisting of trajectories of people and vehicles. You et al. proposed an in-memory distributed computing framework for spatial join operations, because Hadoop must store its intermediate results on hard disk and the resulting disk I/O cost reduces analytical efficiency [3]. Although these cloud technologies offer the benefits of high-performance computing for GIS applications based on static spatial big data (SBD), a novel approach to cloud computing is needed to cope with highly dynamic spatial data. To lay the foundations for effective visual analytics of trajectory datasets, this manuscript presents a method of building a distributed spatial index on an open-source cloud computing framework, namely, Apache Storm, which has been playing an important role in traffic statistics [5] and network anomaly detection [6]. Based on this technology, we conduct real-time spatial querying of spatial fast data (SFD), including range querying, KNN querying, and spatial join querying.
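To make the query types above concrete, the following is a minimal single-process sketch of a uniform grid index over moving points, supporting position updates, range queries, and KNN queries. It is an illustration only and is not the paper's implementation: the class name `GridIndex`, its methods, and the `cell_size` parameter are hypothetical, and no Storm topology or distribution across worker nodes is shown. In a distributed setting, one could imagine each partition holding an index of this kind, but that mapping is an assumption here.

```python
import math
from collections import defaultdict


class GridIndex:
    """Hypothetical uniform grid over moving points (one in-memory partition)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> ids of objects in that cell
        self.positions = {}            # object id -> latest (x, y)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def update(self, oid, x, y):
        """Insert or move an object; the cost does not grow with the object count."""
        old = self.positions.get(oid)
        if old is not None:
            old_cell = self._cell(*old)
            self.cells[old_cell].discard(oid)
            if not self.cells[old_cell]:
                del self.cells[old_cell]
        self.positions[oid] = (x, y)
        self.cells[self._cell(x, y)].add(oid)

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return the sorted ids of objects inside the closed rectangle."""
        c0, r0 = self._cell(xmin, ymin)
        c1, r1 = self._cell(xmax, ymax)
        hits = []
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                for oid in self.cells.get((c, r), ()):
                    x, y = self.positions[oid]
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(oid)
        return sorted(hits)

    def knn(self, qx, qy, k):
        """Return up to k nearest ids by scanning growing rings of cells."""
        if k <= 0:
            return []
        qc, qr = self._cell(qx, qy)
        found = []  # (distance, oid) for every object seen so far
        ring = 0
        while len(found) < len(self.positions):
            for c in range(qc - ring, qc + ring + 1):
                for r in range(qr - ring, qr + ring + 1):
                    if max(abs(c - qc), abs(r - qr)) != ring:
                        continue  # keep only cells on the current ring
                    for oid in self.cells.get((c, r), ()):
                        x, y = self.positions[oid]
                        found.append((math.hypot(x - qx, y - qy), oid))
            found.sort()
            # Unseen objects lie at least ring * cell_size away from the query,
            # so the current k-th candidate is final once it is within that bound.
            if len(found) >= k and found[k - 1][0] <= ring * self.cell_size:
                break
            ring += 1
        return [oid for _, oid in found[:k]]


idx = GridIndex(cell_size=10.0)
idx.update("a", 1, 1)
idx.update("b", 12, 3)
idx.update("c", 55, 55)
idx.update("a", 58, 58)  # object "a" moves to a new cell

in_box = idx.range_query(0, 0, 20, 20)
nearest = idx.knn(50, 50, 2)
```

The ring scan revisits the whole square at each step for brevity; a production index would enumerate only the boundary cells. A spatial join between two moving datasets could be approximated by running a range query of this kind for each object of the other dataset, although the secondary index described in the abstract is not reproduced here.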