Abstract

With the rapid development of mobile devices and sensors, effective search methods for big spatial data have recently received a significant amount of attention. Owing to its large size, recently generated spatial data is typically stored in NoSQL databases such as HBase. As the index of HBase only supports one-dimensional row keys, spatial data is commonly enumerated using linearization techniques. However, linearization techniques cannot completely guarantee the spatial proximity of data. Therefore, several studies have attempted to reduce false positives in spatial query processing by implementing a multi-dimensional indexing layer. In this paper, we propose a hierarchical indexing structure called a quadrant-based minimum bounding rectangle (QbMBR) tree for effective spatial query processing in HBase. In our method, spatial objects are grouped more precisely by using QbMBRs and are indexed based on those QbMBRs. The QbMBR tree not only provides more selective query processing, but also reduces the storage space required for indexing. Based on the QbMBR tree index, we also propose two query-processing algorithms, one for range queries and one for kNN queries. The algorithms significantly reduce query execution times by prefetching the necessary index nodes into memory while traversing the QbMBR tree. Experimental analysis demonstrates that our method significantly outperforms existing methods.

Highlights

  • Due to the development of data acquisition techniques, the main sources of modern spatial data have become sensors, the Internet of Things (IoT), social media, and mobile phones

  • As a result of these environmental changes, handling large-sized geo-tagged data has become one of the important issues for current data management systems. While traditional spatial data such as satellite imagery, road networks, and raster images have been classified as large data, the sizes of the geo-spatial datasets generated in the Internet of Everything (IoE) environment exceed the capacity of current computing systems [1]

  • We present analyses for the storage overhead of our quadrant-based minimum bounding rectangle (QbMBR) tree


Summary

Introduction

Due to the development of data acquisition techniques, the main sources of modern spatial data have become sensors, the Internet of Things (IoT), social media, and mobile phones. In order to efficiently store and search big spatial data, many studies have attempted to use the Hadoop distributed file system and the MapReduce technique [3,4,5]. These methods suffer from a large number of disk I/O operations during spatial query processing because the underlying file systems lack spatial awareness. [...] unrelated regions because they roughly partition the data using fixed-size grid cells. To solve this problem, we previously proposed an indexing method [14] using quadrant-based minimum bounding rectangles (QbMBRs). Since our index dynamically partitions data into smaller units, it can support more selective query processing with less storage overhead. QbMBR tree indexing improves query processing performance under various conditions, and requires less storage overhead than other methods.
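
To illustrate why linearizing spatial data into one-dimensional row keys cannot fully preserve spatial proximity, the following minimal sketch uses a Z-order (Morton) curve, one common linearization technique. The text here does not name a specific curve, so the choice of Z-order, the function names `interleave_bits` and `row_key`, and the 16-bit cell coordinates are all assumptions made for illustration and are not the paper's method.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order (Morton) value: x bits go to even positions, y bits to odd.

    Hypothetical helper for illustration; the paper's own linearization
    scheme may differ.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def row_key(x: int, y: int, bits: int = 16) -> bytes:
    """Encode the Z-order value as big-endian bytes.

    Big-endian encoding makes byte-wise lexicographic order (the order in
    which HBase sorts row keys) agree with numeric order.
    """
    return interleave_bits(x, y, bits).to_bytes((2 * bits + 7) // 8, "big")


# Two horizontally adjacent grid cells end up far apart in key order ...
left = interleave_bits(3, 0)
right = interleave_bits(4, 0)
print(left, right)

# ... while two cells that are three columns apart get consecutive keys.
print(interleave_bits(3, 1), interleave_bits(0, 2))

# A single key-range scan covering both adjacent cells also touches every
# key in between, most of which belong to cells outside the query region.
scanned = right - left + 1
print(f"keys scanned: {scanned}, relevant cells: 2")
```

Under these assumptions, a scan that covers two neighboring cells reads many unrelated keys, which is the kind of false positive that a multi-dimensional indexing layer is meant to filter out.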

Related Works
Data Partitioning with Quadrant-Based MBR
Range Query Algorithm for a QbMBR Tree
Example
Experimental Setup and Datasets
Effect of Query Radius
Effect of Database Size
Findings
Effect of the Number of Nearest Neighbors