Abstract
In real-world networks, an edge (or vertex) may be associated with a transaction database, where each transaction consists of a set of items to describe the attributes or behaviors of the edge (or vertex). This type of graph is named as a transactional graph, and the problem of mining sequential patterns from transactional graphs is to discover the subsequences of transaction sequences that frequently appear in many paths of the graph. Solving this problem can facilitate broad applications ranging from social network analysis to urban computing. However, this task is not trivial due to a large number of sequences existed in the graph. In particular, each path of the graph may induce multiple sequences of transactions, and gathering the transaction sequences induced by all the paths may lead to an extremely large volume. To efficiently approximate the top-k patterns, we propose a Parallelized Sampling-based Approach For Mining Top-k Sequential Patterns, PSMSP, which involves two key techniques: (a) a parallelized unbiased sequence sampling approach based on a sequence balanced graph partition strategy, and (b) a novel PSP-Tree structure to efficiently mine the patterns based on the anti-monotonicity properties. The experimental results on synthetic and real-world datasets demonstrate that the PSMSP can successfully find the sequential patterns with superior efficiency.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.