Abstract

Sequential pattern mining (SPM) discovers, from event transactions recorded over time, patterns of events that follow a sequential order. In this work, we introduce a new, efficient sequential pattern mining algorithm called VEPRECO. VEPRECO makes three main contributions that speed up the mining process: a vertical representation of patterns, pre-pruning strategies that avoid checking infrequent patterns, and common candidate selection policies that reduce the number of iterations performed by the algorithm. An experimental evaluation was performed with synthetic and real-world datasets, and the results were compared with CM-SPAM, the most time- and memory-efficient sequential pattern mining algorithm in the literature, which we took as a baseline. We analysed separately how each of the proposed contributions affects time and memory usage and found that the proposed pattern representation yielded the largest reduction in both. Pre-pruning strategies and common candidate selection policies reduce runtime on datasets with many sequences and similar lengths of transactions and sequences.
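To make the idea of a vertical representation concrete, the following is a minimal sketch of a generic vertical id-list layout of the kind used by vertical SPM algorithms. It is not VEPRECO's actual data structure, which the abstract does not describe. The function names (`build_vertical`, `s_extend`, `support`) and the toy database are assumptions made for illustration only.

```python
from collections import defaultdict

def build_vertical(db):
    """Map each item to {sequence id: ascending list of itemset positions}."""
    vertical = defaultdict(dict)
    for sid, sequence in enumerate(db):
        for pos, itemset in enumerate(sequence):
            for item in itemset:
                vertical[item].setdefault(sid, []).append(pos)
    return vertical

def s_extend(idlist_a, idlist_b):
    """Id-list of the pattern <a, b>: positions of b strictly after the
    earliest occurrence of a within the same sequence."""
    joined = {}
    for sid, pos_a in idlist_a.items():
        if sid in idlist_b:
            later = [p for p in idlist_b[sid] if p > pos_a[0]]
            if later:
                joined[sid] = later
    return joined

def support(idlist):
    """Support is the number of distinct sequences containing the pattern."""
    return len(idlist)

# Hypothetical toy database: each sequence is a list of itemsets (transactions).
db = [
    [{"a"}, {"b", "c"}, {"a"}],
    [{"b"}, {"a"}, {"c"}],
    [{"a", "b"}, {"c"}],
]

vertical = build_vertical(db)
ab = s_extend(vertical["a"], vertical["b"])  # pattern <a, b>
ac = s_extend(vertical["a"], vertical["c"])  # pattern <a, c>
print(support(ab), support(ac))
```

In this layout, the support of a candidate is obtained by joining the id-lists of its parts rather than rescanning the whole database, which is the usual motivation for vertical formats.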
