Abstract

In recent years, positioning sensors have become ubiquitous, and the amount of trajectory data has grown tremendously. Storing and querying massive trajectory data efficiently is a major challenge. Among the typical operations over trajectories, the similarity query is important yet complicated. It is useful in navigation systems, transportation optimization, and similar applications. However, most existing studies address the problem on a centralized system, and a single machine can hardly satisfy the storage and processing requirements of massive data. A distributed framework for similarity queries over massive trajectory data is therefore urgently needed. In this research, we propose DFTHR (distributed framework based on HBase and Redis) to support the similarity query using Hausdorff distance. DFTHR uses a segment-based data model with a number of optimizations for storing, indexing, and pruning to ensure efficient querying. Furthermore, it adopts a bulk-based method to reduce the cost of adjusting partitions, so that incremental datasets can be supported efficiently. Additionally, DFTHR introduces a co-location-based distribution strategy and a node-locality-based parallel query algorithm to reduce inter-worker overhead. Experiments show that DFTHR significantly outperforms other schemes.
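
For reference, the standard Hausdorff distance between two point sets is commonly written as below. The symbols A, B, d, and h are notation assumed here for illustration; the abstract does not state which variant (for example, a segment-based one) DFTHR actually uses.

```latex
\[
h(A, B) = \max_{a \in A} \min_{b \in B} d(a, b),
\qquad
H(A, B) = \max\bigl\{ h(A, B),\; h(B, A) \bigr\}
\]
```

Here d is a point-to-point distance such as the Euclidean distance, h is the directed distance from A to B, and H is the symmetric Hausdorff distance.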

Highlights

  • With the rapid development of mobile networks and positioning technologies, trajectory data of moving objects (MOs) can be collected in greater volume and with greater accuracy

  • In light of that, using Hausdorff distance [13] as the distance function, we propose DFTHR, a distributed framework for trajectory similarity query based on HBase [14] and Redis [15]

  • It is expensive to compute the similarity between two trajectories directly, so, based on the trajectory segment data model, we propose a lower-bound-based pruning method and an MBR
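
The idea of pruning with a cheap lower bound before computing the full Hausdorff distance can be sketched as follows. This is a minimal illustration, not DFTHR's method: it works on whole trajectories as lists of 2-D points rather than on trajectory segments, it assumes the Euclidean distance, and every function name (`hausdorff`, `mbr`, `mbr_min_dist`, `threshold_query`) is hypothetical. The bound used is the minimum distance between two minimum bounding rectangles (MBRs), which can never exceed the distance between any pair of points they contain and therefore never exceeds the Hausdorff distance.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(hypot(px - qx, py - qy) for qx, qy in b) for px, py in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def mbr(points):
    """Minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_min_dist(r, s):
    """Minimum distance between two rectangles (0.0 if they overlap)."""
    dx = max(r[0] - s[2], s[0] - r[2], 0.0)
    dy = max(r[1] - s[3], s[1] - r[3], 0.0)
    return hypot(dx, dy)


def threshold_query(query, candidates, eps):
    """Return ids of candidates whose Hausdorff distance to `query` is <= eps.

    `candidates` is a hypothetical in-memory dict of id -> point list,
    standing in for whatever storage layer holds the trajectories.
    """
    q_box = mbr(query)
    hits = []
    for tid, traj in candidates.items():
        # Cheap test first: if even the rectangles are farther apart than
        # eps, the exact distance must be too, so skip the costly step.
        if mbr_min_dist(q_box, mbr(traj)) > eps:
            continue
        if hausdorff(query, traj) <= eps:
            hits.append(tid)
    return hits
```

In this sketch the rectangle test costs constant time per candidate once the MBRs are known, while the exact distance is quadratic in the number of points, so the benefit grows with the share of candidates that lie far from the query.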


Summary

Introduction

With the rapid development of mobile networks and positioning technologies, trajectory data of moving objects (MOs) can be collected in greater volume and with greater accuracy. To process similarity queries over large-scale trajectory data, several schemes based on distributed frameworks [9,10,11,12] have been proposed, each built around its chosen distance function. We propose a bulk-based partitioning model to deploy trajectory data and indexes in a distributed environment and, on this basis, implement a cost-optimization mechanism that efficiently handles the state changes of each partition. This reduces the cost of adjusting the data distribution, so that our scheme can support incremental datasets efficiently. We devise maintenance strategies to ensure co-location between each partition and its corresponding index, and implement node-locality-based parallel query algorithms to reduce the data transmission overhead of querying.

Related Work
Problem Formulation
Overview
T-Table
Trajectory Segment-Based Storage Model
Bulk-Based Partitioning Model
T-table
R-Index
Spatio-temporal
Rowkey
Trajectory Segment-Based Pruning
Threshold-Based Query
Maintenance Module
Set Up
Performance of Data Insertion
Performance of Threshold-Based Query
Performance of k-NN Query
11. Performance
Conclusions