Abstract
The generic sequential pattern mining problem aims to mine, from a sequential database, the set of sequential patterns that satisfy a minimum support or occurrence threshold constraint. The main challenges affecting the efficiency of a solution include reducing the pattern search space, detecting infrequent patterns early, and representing the database in an efficient format. Additional challenges arise when the problem setting shifts from a static to an incremental database: rather than re-mining from scratch, a solution must efficiently track the effect of the incremental portion on the complete updated database. In this article, we introduce a new tree-based solution to the sequential pattern mining problem, comprising two sets of novel solutions for static and incremental sequential databases. We propose two new structures, SP-Tree and IncSP-Tree, and design two efficient algorithms, Tree-Miner and IncTree-Miner, to mine the complete set of sequential patterns from static and incremental databases, respectively. The proposed structures store the complete sequential database efficiently, maintain the “build-once-mine-many” property, and give scope for interactive mining. We also design a new breadth-first support counting technique to identify infrequent patterns at early stages and a new heuristic pruning strategy to reduce the pattern search space. In addition, we design a new pattern storage structure, BPFSP-Tree, which stores the frequent patterns across successive iterations of incremental mining to reduce the number of database scans and to remove infrequent patterns efficiently. A novel structure named Sequence Summarizer is also introduced to efficiently calculate and update the co-occurrence information of the items, especially in an incremental environment.
Experimental results on various real-life and synthetic datasets demonstrate the efficiency of our work in comparison with related state-of-the-art approaches.
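To make the minimum support constraint in the problem statement concrete, the following is a minimal sketch of naive support counting by a full database scan. It is not Tree-Miner or the breadth-first support counting technique proposed here. The names `contains`, `support`, `is_frequent`, and `seq`, as well as the toy database, are hypothetical and chosen only for illustration. The sketch assumes the common convention that a sequence is an ordered list of itemsets and that the support of a pattern is the number of database sequences containing it.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]
Sequence = List[Itemset]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern itemset must be a subset of some sequence itemset,
    and the matched itemsets must appear in the same order.
    Greedy left-to-right matching is sufficient for this check.
    """
    pos = 0
    for event in sequence:
        if pos < len(pattern) and pattern[pos] <= event:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Count the database sequences that contain `pattern`."""
    return sum(1 for sequence in database if contains(sequence, pattern))


def is_frequent(database: List[Sequence], pattern: Sequence, min_support: int) -> bool:
    """A pattern is frequent if its support reaches the threshold."""
    return support(database, pattern) >= min_support


def seq(*events: str) -> Sequence:
    """Hypothetical helper: seq("ab", "c") builds the sequence <(ab)(c)>."""
    return [frozenset(event) for event in events]


# Toy database (illustrative only, not taken from the experiments).
database = [
    seq("ab", "c", "d"),  # <(ab)(c)(d)>
    seq("a", "bc"),       # <(a)(bc)>
    seq("b", "a", "c"),   # <(b)(a)(c)>
    seq("c", "d"),        # <(c)(d)>
]

pattern = seq("a", "c")   # <(a)(c)>
print(support(database, pattern))                      # 3
print(is_frequent(database, pattern, min_support=3))   # True
```

Every call to `support` in this sketch rescans the whole database, which is the cost that compact tree representations and early pruning of infrequent patterns are intended to reduce.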