Abstract

Sequential pattern mining is an important and useful tool with broad applications, such as analyzing customer purchase behavior and recommending services to customers. It is challenging because an explosive number of subsequences must be examined, and both memory and computational costs become extremely high as the sequence database grows. Many earlier algorithms developed for efficient sequential pattern mining have difficulty handling large-scale data. In this paper, we propose a parallel sequential pattern mining method called PTDS (Parallel Transaction-Decomposed Sequential pattern mining), which decomposes transactions to mine sequential patterns. PTDS greatly accelerates pattern growth and improves the efficiency of the parallel algorithm on large-scale data. We experiment on a large dataset consisting of 16 million service purchase sequences. In addition to demonstrating scalability, the empirical comparisons show that PTDS consistently outperforms both the PrefixSpan-based parallel method and the serial algorithm.

