Abstract

Frequent itemset mining (FIM) is a core task in association rule mining: it finds all frequent k-itemsets in a transaction dataset, from which the association rules are then extracted. In the big-data era, datasets are huge and expand rapidly, so transactions added over time periodically change the correlations and frequent itemsets present in the dataset. Re-mining the updated dataset from scratch is impractical and costly; incremental frequent itemset mining addresses this problem. Many researchers treat the new transactions as a separate dataset (partition) that is mined to obtain all of its local frequent itemsets. These local frequent itemsets are then combined into a set of global candidates whose support counts can be estimated without re-scanning the original dataset. However, such approaches suffer from the generation of a huge number of candidates, and the support count estimation remains imprecise. In this paper, a Closed Candidates-based Incremental Frequent Itemset Mining approach (CC-IFIM) is proposed to reduce candidate generation and improve the accuracy of the retrieved global frequent itemsets. The proposed approach prunes many of the generated candidates at an early stage, before any further computation is performed. To improve the accuracy of the support count estimation for the remaining candidates, the similarity between partitions is evaluated using only the local closed candidates rather than all candidates. Experimental results show that CC-IFIM outperforms its competitors in both efficiency and accuracy.
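To make the partition-based workflow the abstract describes concrete, the following is a minimal, illustrative Python sketch: it mines local frequent itemsets in each partition, filters them down to closed itemsets, and merges the local results into global candidates whose supports are estimated without re-scanning. The function names (local_frequent, closed_only, incremental_merge) and the crude estimation rule are assumptions for illustration only; they are not the authors' CC-IFIM algorithm, which additionally uses the closed candidates to measure partition similarity.

```python
# Illustrative sketch of partition-based incremental FIM (not the authors' CC-IFIM code).
# Assumes two partitions (old transactions and a new batch), a relative minsup,
# and exhaustive itemset enumeration -- suitable only for tiny examples.
from itertools import combinations

def local_frequent(partition, minsup):
    """Return {itemset: support count} for all itemsets frequent in this partition."""
    threshold = minsup * len(partition)
    items = sorted({i for t in partition for i in t})
    freq = {}
    for k in range(1, len(items) + 1):
        found = False
        for cand in combinations(items, k):
            count = sum(1 for t in partition if set(cand) <= t)
            if count >= threshold:
                freq[frozenset(cand)] = count
                found = True
        if not found:
            break  # Apriori property: no frequent k-itemset => no frequent (k+1)-itemset
    return freq

def closed_only(freq):
    """Keep only closed itemsets: those with no proper superset of equal support."""
    return {s: c for s, c in freq.items()
            if not any(s < t and c == freq[t] for t in freq)}

def incremental_merge(old_part, new_part, minsup):
    """Combine local results; estimate supports of candidates missing in one partition."""
    f_old = local_frequent(old_part, minsup)
    f_new = local_frequent(new_part, minsup)
    candidates = set(f_old) | set(f_new)
    global_frequent = {}
    for c in candidates:
        # If a candidate is not locally frequent in a partition, its count there is
        # below minsup * |partition|; use that bound as a crude (imprecise) estimate.
        cnt_old = f_old.get(c, minsup * len(old_part))
        cnt_new = f_new.get(c, minsup * len(new_part))
        if cnt_old + cnt_new >= minsup * (len(old_part) + len(new_part)):
            global_frequent[c] = cnt_old + cnt_new
    return global_frequent

# Tiny usage example with made-up transactions.
old = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
new = [{"a", "b"}, {"a", "b", "d"}, {"b", "d"}]
print(incremental_merge(old, new, minsup=0.5))
# Closed itemsets are a compact, lossless summary of the local frequent itemsets,
# which is why carrying only them between partitions reduces the candidate load.
print(len(closed_only(local_frequent(new, 0.5))), "closed of",
      len(local_frequent(new, 0.5)), "frequent itemsets in the new partition")
```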
