Abstract

Sequential pattern mining (SPM) discovers patterns of events that follow a sequential order from event transactions recorded over time. In this work, we introduce VEPRECO, a new efficient sequential pattern mining algorithm. VEPRECO makes three main contributions that speed up the mining process: a vertical representation of patterns, pre-pruning strategies that avoid checking infrequent patterns, and common candidate selection policies that reduce the number of iterations performed by the algorithm. An experimental evaluation was performed with synthetic and real-world datasets, and the results were compared with CM-SPAM, the most time- and memory-efficient sequential pattern mining algorithm in the literature, which we took as a baseline. We analysed separately how each of the proposed contributions affects time and memory usage and found that the proposed pattern representation yielded the largest reduction in both. Pre-pruning strategies and common candidate selection policies reduce runtime in datasets with many sequences and similar lengths of transactions and sequences.
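To make the idea of a vertical representation concrete, the following minimal Python sketch stores, for each item, the sequences and positions where it occurs, and derives the support of a longer pattern by joining those lists. It is an illustration in the style of classical id-list based SPM algorithms, not VEPRECO's actual data structure: the names `build_vertical`, `s_extend`, and `support`, the toy database `db`, and the threshold `min_support` are all assumptions introduced here.

```python
from collections import defaultdict

def build_vertical(db):
    """Map each item to {sequence_id: [positions of itemsets containing it]}."""
    vertical = defaultdict(lambda: defaultdict(list))
    for sid, sequence in enumerate(db):
        for pos, itemset in enumerate(sequence):
            for item in itemset:
                vertical[item][sid].append(pos)  # positions stay sorted
    return vertical

def support(idlist):
    """Support = number of distinct sequences in which the pattern occurs."""
    return len(idlist)

def s_extend(idlist_p, idlist_x):
    """Id-list of pattern P followed later by item x.

    Keeps, per sequence, the positions of x that come strictly after
    the earliest position at which P ends.
    """
    result = {}
    for sid, ends_p in idlist_p.items():
        pos_x = idlist_x.get(sid)
        if pos_x is None:
            continue
        later = [p for p in pos_x if p > ends_p[0]]
        if later:
            result[sid] = later
    return result

# Hypothetical toy database: each sequence is a list of itemsets.
db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a", "b"}, {"c"}],
    [{"b"}, {"a"}],
    [{"a"}, {"c"}, {"b"}],
]
min_support = 2  # assumed threshold, for illustration only

vertical = build_vertical(db)
ab = s_extend(vertical["a"], vertical["b"])   # pattern <a, b>
ac = s_extend(vertical["a"], vertical["c"])   # pattern <a, c>
abc = s_extend(ab, vertical["c"])             # pattern <a, b, c>

# Only patterns meeting the threshold are worth extending further.
frequent = {name: support(ids)
            for name, ids in {"ab": ab, "ac": ac, "abc": abc}.items()
            if support(ids) >= min_support}
```

In this layout the original database is scanned only once; every later support count comes from intersecting id-lists, which is the general reason vertical formats tend to reduce mining time. Dropping `abc` here because its support falls below `min_support` hints at why pruning infrequent candidates early saves work, although the specific pre-pruning rules of VEPRECO are not described in this abstract.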
