Parallel Sequential Pattern Mining of Massive Trajectory Data

Shaojie Qiao, Jing Peng, Tianrui Li, Jiangtao Qiu

doi:10.1080/18756891.2010.9727705

Abstract

The trajectory pattern mining problem has recently attracted much attention owing to the rapid development of location-acquisition technologies, and parallel computing offers an alternative way to handle it. This study addresses the parallel mining of trajectory sequential patterns, building on newly proposed concepts for trajectory pattern mining. We propose an efficient and effective parallel sequential pattern mining algorithm (plute) that combines three essential techniques: prefix projection, data parallel formulation, and task parallel formulation. First, prefix projection decomposes the search space and greatly reduces the number of candidate trajectory sequences. Second, the data parallel formulation decomposes the computations associated with counting the support of trajectory patterns. Third, the task parallel formulation employs the MapReduce programming model to distribute the computations across a set of machines in a scalable and easy-to-use fashion. Based on the properties of parallel trajectory sequences, item pruning and sequence pruning strategies are applied to further prune the candidate sequences. Extensive experiments evaluate the performance of plute in terms of parallel computing time and communication cost among processors. The experimental results show that plute outperforms the previously proposed parallel mining strategy (PartSpan) in mining massive trajectory data.
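
To make the first two techniques concrete, the following is a minimal single-process sketch of prefix projection combined with partitioned (map/reduce-style) support counting. It is not the plute algorithm itself: the function names, the round-robin partitioning, the representation of a trajectory as a tuple of location labels, and the omission of the item and sequence pruning strategies are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def project(sequences, item):
    """Prefix projection: keep the suffix after the first occurrence of item."""
    projected = []
    for seq in sequences:
        if item in seq:
            suffix = seq[seq.index(item) + 1:]
            if suffix:  # drop empty suffixes, they cannot extend a pattern
                projected.append(suffix)
    return projected


def map_count(partition):
    """Map step: per partition, count the sequences that contain each item."""
    counts = Counter()
    for seq in partition:
        counts.update(set(seq))  # set() so an item counts once per sequence
    return counts


def frequent_items(partitions, min_support):
    """Reduce step: merge the partial counts and keep the frequent items."""
    total = reduce(lambda a, b: a + b, map(map_count, partitions), Counter())
    return {item: n for item, n in total.items() if n >= min_support}


def mine(sequences, min_support, n_parts=2, prefix=()):
    """Recursively grow patterns from each frequent item of the projection."""
    # Hypothetical data decomposition: round-robin split into n_parts pieces.
    partitions = [sequences[i::n_parts] for i in range(n_parts)]
    patterns = {}
    frequent = frequent_items(partitions, min_support)
    for item, support in sorted(frequent.items()):
        pattern = prefix + (item,)
        patterns[pattern] = support
        patterns.update(
            mine(project(sequences, item), min_support, n_parts, pattern)
        )
    return patterns


# Toy trajectories: each tuple is an ordered list of visited locations.
trajectories = [("A", "B", "C"), ("A", "C"), ("B", "C"), ("A", "B")]
result = mine(trajectories, min_support=2)
```

In a real deployment the map step would run on separate machines and only the partial counts would be communicated, which is where the communication cost mentioned in the abstract arises; here `map` runs sequentially to keep the sketch self-contained.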

Full Text