ISSP-tree: An improved fast algorithm for constructing a complete prefix tree using single database scan

Shafiul Alom Ahmed, Bhabesh Nath

doi:10.1016/j.eswa.2021.115603

Abstract

Researchers have typically studied the frequent pattern mining problem under the assumptions that the complete data set to be processed fits in the system's main memory and that databases are static. However, in real-life scenarios any transactional or online database may be modified by the insertion of new transactions or the deletion of obsolete records. Moreover, the support threshold may be changed over time to generate a new set of frequent patterns from the updated database. A straightforward but inefficient way to deal with this problem is to recompute the set of patterns from scratch for the updated database or updated support threshold. Most existing algorithms mine patterns using multiple database scans, which requires a large amount of main memory and computation time to retain the candidate itemsets and prune the unnecessary ones. The research community has developed a few methods that handle the incremental scenario without re-computation from scratch, and those methods are efficient in terms of the number of database scans. Although these approaches solve the re-computation problem by constructing a complete pattern-tree data structure using only one database scan, they suffer from significant issues such as massive disk I/O, a very large search space, and high tree construction time. Therefore, to improve the tree construction time, we propose an efficient tree data structure called ISSP-tree (Improved Single Scan Pattern Tree), which creates a complete tree that retains all the database transactions, irrespective of the item frequencies, using only one database scan. Moreover, the method is also adaptive to incremental and interactive mining.
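To make the idea of a "complete" single-scan prefix tree concrete, the following is a minimal sketch of a generic prefix tree that stores every item of every transaction in one pass, without any frequency-based pruning. It is not the ISSP-tree construction itself, whose details are not given in the abstract. The names `Node`, `insert_transaction`, `build_tree`, and `item_support`, as well as the choice of lexicographic item order, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class Node:
    """One prefix-tree node: an item label, a count, and child links."""
    item: Optional[str] = None
    count: int = 0
    children: Dict[str, "Node"] = field(default_factory=dict)


def insert_transaction(root: Node, transaction: Iterable[str]) -> None:
    """Insert one transaction along a single root-to-node path.

    Assumption: items are placed in lexicographic order, which does not
    depend on item frequencies, so no preliminary counting scan is needed.
    """
    node = root
    for item in sorted(set(transaction)):
        child = node.children.get(item)
        if child is None:
            child = Node(item=item)
            node.children[item] = child
        child.count += 1
        node = child


def build_tree(transactions: Iterable[Iterable[str]]) -> Node:
    """Build the tree in one pass; every item is kept, frequent or not."""
    root = Node()
    for transaction in transactions:
        insert_transaction(root, transaction)
    return root


def item_support(root: Node) -> Dict[str, int]:
    """Recover each item's support by summing counts over all its nodes."""
    support: Dict[str, int] = {}
    stack = list(root.children.values())
    while stack:
        node = stack.pop()
        support[node.item] = support.get(node.item, 0) + node.count
        stack.extend(node.children.values())
    return support


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "c"]]
    tree = build_tree(db)
    print(item_support(tree))

    # Incremental update: a new transaction is inserted into the same tree
    # without rescanning the original database.
    insert_transaction(tree, ["d", "a"])
    print(item_support(tree))
```

Because nothing is pruned at construction time, a tree of this kind can in principle be mined again under a different support threshold, or extended with new transactions, without rereading the original database. The price is a larger tree than one built over frequent items only.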

Full Text