Efficiently Mining Frequent Itemsets on Massive Data

Xixian Han, Jian Chen, Guojun Lai, Xianmin Liu, Hong Gao, Jianzhong Li

doi:10.1109/access.2019.2902602

Abstract

Frequent itemset mining is an important operation that returns all itemsets in a transaction table that occur as a subset of at least a specified fraction of the transactions. Existing algorithms cannot compute frequent itemsets efficiently on massive data, since they either require multiple passes over the table or construct complex data structures that normally exceed the available memory. This paper proposes a novel precomputation-based frequent itemset mining (PFIM) algorithm to compute frequent itemsets quickly on massive data. PFIM treats the transaction table as two parts: a large old table storing historical data and a relatively small new table storing newly generated data. PFIM first pre-constructs, on the old table, the quasi-frequent itemsets, whose supports are above the lower bound of the practical support level. Given a specified support threshold, PFIM can quickly return the required frequent itemsets on the table by utilizing the quasi-frequent itemsets. Three pruning rules are presented to reduce the number of candidates involved. An incremental update strategy is devised to efficiently reconstruct the quasi-frequent itemsets when the tables are merged. Extensive experimental results on synthetic and real-life data sets show that PFIM has a significant advantage over the existing algorithms and runs two orders of magnitude faster than the latest algorithm.
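
To make the definitions in the abstract concrete, the following is a minimal sketch in Python. It is not the PFIM algorithm: it uses brute-force enumeration on a single small table, and it leaves out the new table, the three pruning rules, and the incremental update. It only illustrates what "support" means and how itemsets precomputed at a lower-bound support can answer later queries at any higher threshold without rescanning the data. The function names, the toy table, and the threshold values are all assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def mine_itemsets(transactions, min_support):
    """Brute-force miner: map each itemset with support >= min_support to its support."""
    transactions = [frozenset(t) for t in transactions]
    universe = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(universe) + 1):
        found = False
        for combo in combinations(universe, size):
            s = support(combo, transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
                found = True
        if not found:
            # Support never grows when items are added, so if no itemset of
            # this size qualifies, no larger itemset can qualify either.
            break
    return result


def query(quasi_frequent, threshold):
    """Answer a threshold query from the precomputed itemsets only.

    Valid only when `threshold` is at least the lower bound used for
    the precomputation.
    """
    return {iset: s for iset, s in quasi_frequent.items() if s >= threshold}


# Hypothetical toy "old table" of five transactions.
old_table = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

# Precompute once at an assumed lower-bound support of 0.3 ...
quasi = mine_itemsets(old_table, 0.3)

# ... then answer queries at higher thresholds without touching the table.
frequent_at_half = query(quasi, 0.5)
frequent_at_70 = query(quasi, 0.7)
```

In this toy table every single item has support 0.8, every pair 0.6, and the full triple 0.4, so a query at 0.5 keeps the singles and pairs while a query at 0.7 keeps only the singles.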

Journal: IEEE Access	Publication Date: Jan 1, 2019
Citations: 40	License type: cc-by-nc-nd

Similar Papers

A new algorithm for fast mining frequent itemsets using N-lists
Zhihong Deng ... Zhonghui Wang
Science China Information Sciences | VOL. 55
19 Jul 2012

Stable Periodic Frequent Itemset Mining on Uncertain Datasets
Ruimeng He ... Yuxin Duan
13 Aug 2021

Mining Rare Itemset with Automated Support Thresholds
Sadhasivam
Journal of Computer Science | VOL. 7
01 Mar 2011

Mining Frequent Itemsets in Large Data Warehouses: A Novel Approach Proposed for Sparse Data Sets
S M Fakhrahmad ... M H Sadreddini
16 Dec 2007
