Abstract

Current social-network-based and location-based-service applications need to handle continuous spatial approximate keyword queries over high-density geo-textual streaming data. Continuous queries are well known to be expensive, and optimizing their processing remains an open issue. For geo-textual streaming data the performance problem is more serious, since both the location information and the textual description must be matched for each incoming tuple. State-of-the-art continuous spatial-keyword query indexing approaches generally lack both support for approximate keyword matching and high-performance processing of geo-textual streaming data. To tackle this problem, this paper first proposes AP-tree+, an indexing approach that efficiently supports continuous spatial approximate keyword queries by integrating min-wise signatures into an AP-tree. AP-tree+ uses the one-permutation min-wise hashing method, which achieves much lower signature maintenance costs than the traditional min-wise hashing method because it employs only one hash function instead of dozens. To provide a more efficient indexing approach, this paper also explores the feasibility of parallelizing AP-tree+ on a Graphics Processing Unit (GPU). We map the AP-tree+ data structure into the GPU's memory as a set of one-dimensional arrays to form the GPU-aided AP-tree+. Furthermore, a data-parallel min-wise hashing algorithm and a GPU-CPU data communication method based on a four-stage pipeline are used to optimize the performance of the GPU-aided AP-tree+.
The experimental results indicate that (1) AP-tree+ reduces the space cost by about 11% compared with MHR-tree, (2) AP-tree+ achieves comparable recall and a 5.64× query performance gain over MHR-tree while saving 41.66% of the maintenance cost on average, (3) the GPU-aided AP-tree+ attains an average speedup of 5.76× over AP-tree+, and (4) the GPU-CPU data communication scheme further improves the query performance of the GPU-aided AP-tree+ by 39.4%.
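
To make the difference between traditional and one-permutation min-wise hashing concrete, the following is a minimal Python sketch of the general one-permutation technique, not the paper's implementation. The q-gram tokenization, the helper names (`qgrams`, `one_permutation_signature`, `estimate_jaccard`), the BLAKE2b hash, and the parameters `k` and `bin_width` are all assumptions chosen for illustration. Each token is hashed once, the hash space is split into `k` equal bins, and the smallest offset in each bin becomes one signature slot.

```python
import hashlib

EMPTY = -1  # marker for a bin that received no token (illustrative choice)


def qgrams(word, q=2):
    """Split a keyword into padded q-grams (assumed tokenization)."""
    padded = "#" * (q - 1) + word.lower() + "$" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def _hash(token, space):
    """Single hash function mapping a token into [0, space)."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % space


def one_permutation_signature(tokens, k=64, bin_width=1 << 16):
    """Hash every token once and keep the minimum offset per bin."""
    space = k * bin_width
    sig = [EMPTY] * k
    for token in tokens:
        bin_index, offset = divmod(_hash(token, space), bin_width)
        if sig[bin_index] == EMPTY or offset < sig[bin_index]:
            sig[bin_index] = offset
    return sig


def estimate_jaccard(sig_a, sig_b):
    """Estimate set similarity: matching bins / bins non-empty in at least one set."""
    matched = 0
    both_empty = 0
    for a, b in zip(sig_a, sig_b):
        if a == EMPTY and b == EMPTY:
            both_empty += 1
        elif a == b:
            matched += 1
    used = len(sig_a) - both_empty
    return matched / used if used else 0.0


if __name__ == "__main__":
    sig_query = one_permutation_signature(qgrams("restaurant"))
    sig_tuple = one_permutation_signature(qgrams("restaurent"))
    print(estimate_jaccard(sig_query, sig_tuple))
```

A traditional min-wise signature of length `k` would instead apply `k` independent hash functions to every token, so building or updating a signature costs roughly `k` times as many hash evaluations. That difference is the source of the lower maintenance cost described above.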

Highlights

  • Traditional Geographic Information Systems (GIS) are well-adapted for offline algorithms over static data

  • The main contributions of this study are as follows: 1. We have introduced an advanced AP-tree indexing method called AP-tree+ to support continuous spatial approximate keyword queries by efficiently embedding min-wise signatures into the AP-tree structure based on one-permutation min-wise hashing [15]

  • Existing spatial-keyword query indexing approaches generally do not support approximate keyword matching


Summary

Introduction

Traditional Geographic Information Systems (GIS) are well-adapted for offline algorithms over static data. A GIS application is expected to have complete information about the static input data to be processed [1]. With the proliferation of Global Navigation Satellite System (GNSS)-equipped devices and wireless sensor networks, current GIS applications (e.g., location-based services and webGIS) need to be much better suited to online processing in order to deal with large volumes of highly dynamic geo-streaming data. In [3], a spatio-temporal query language is proposed to process semantic geo-streaming data. Moby Dick [4,5], a distributed framework for GeoStreams, has been developed for efficient real-time management and monitoring of mobile objects through distributed geo-streaming data processing on large clusters. A more comprehensive introduction to processing GeoStreams is available in [6].
