An efficient approach based on selective partitioning for maximal frequent itemsets mining

Anita Bai,Meera Dhabu,Parag S Deshpande,Viraj Jagtap

doi:10.1007/s12046-019-1158-1

Abstract

We present a maximal frequent itemset (MFI) mining algorithm based on selective partitioning called SelPMiner. It makes use of a novel data format named Itemset-count tree—a compact and optimized representation in the form of partition that reduces memory requirement. It also does selective partitioning of the database, which reduces runtime to scan database. As the algorithm progressively searches for longer frequent itemsets in a depth-first manner, it creates new partitions with even smaller sizes having less dimensions and unique data instances, which results in faster support counting. SelPMiner uses a number of optimizations to prune the search space. We also prove upper bounds on the amount of memory consumed by these partitions. Experimental comparisons of the SelPMiner algorithm with popular existing fastest MFI mining algorithms on different types of datasets show significant speedup in computation time for many cases. SelPMiner works especially well when the minimum support is low and consumes less memory.

Full Text