Abstract

Frequent pattern (itemset) mining is an established approach to knowledge discovery. Minimizing the number of database scans (I/O overhead) is a central challenge in frequent itemset mining. The Partition algorithm was one of the early approaches to reduce database I/O overhead relative to the Apriori algorithm and related methods. However, the Partition algorithm still incurs significant I/O overhead, because it reads the database twice from secondary storage, and it has high time complexity when computing frequent itemsets in large databases. In this work, an improved partition algorithm is proposed that reads the database only once and uses local support information to avoid further scans of the database. In terms of computational time, the proposed algorithm outperforms the Apriori and Partition algorithms and comes close to the FP-Growth algorithm. In terms of memory usage, it outperforms FP-Growth and is competitive with the other algorithms. In terms of database access time, it performs better than the FP-Growth, Partition, and Apriori methods.
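
For context, the two-scan structure of the classical Partition algorithm mentioned above can be sketched as follows. This is a toy illustration of the baseline only, not the improved single-scan algorithm proposed in this work, whose details the abstract does not give. The function names, the brute-force local miner, and the in-memory list standing in for the database are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent_itemsets(partition, min_support_ratio):
    """Return itemsets that meet the support threshold within one partition.

    Hypothetical helper: enumerates every subset of every transaction,
    which is only practical for tiny examples.
    """
    threshold = ceil(min_support_ratio * len(partition))
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset for itemset, count in counts.items() if count >= threshold}


def partition_mine(transactions, min_support_ratio, num_partitions):
    """Baseline two-scan Partition scheme over an in-memory transaction list."""
    if not transactions:
        return {}
    size = ceil(len(transactions) / num_partitions)
    partitions = [transactions[i:i + size]
                  for i in range(0, len(transactions), size)]

    # Scan 1: the union of locally frequent itemsets forms the global candidates.
    # A globally frequent itemset must be locally frequent in at least one partition.
    candidates = set()
    for part in partitions:
        candidates |= local_frequent_itemsets(part, min_support_ratio)

    # Scan 2: re-read every transaction to count the true global support.
    global_threshold = ceil(min_support_ratio * len(transactions))
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for itemset in candidates:
            if items.issuperset(itemset):
                counts[itemset] += 1
    return {itemset: count for itemset, count in counts.items()
            if count >= global_threshold}


# Made-up sample data for illustration.
sample = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
frequent = partition_mine(sample, min_support_ratio=0.5, num_partitions=2)
```

The second loop over `transactions` is the extra database read that the abstract identifies as the main I/O cost of the Partition algorithm. The proposed method is described as avoiding it by relying on the local support information gathered during the first pass.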
