Abstract
With the rapid growth of data volume and the diversification of demand, there is an urgent need to extract useful frequent itemsets from datasets of different scales. Traditional methods can certainly solve this problem, but they do not fully exploit the relationships among datasets of different scales. The fast approach proposed in this paper is as follows: once the frequent itemsets of the small-scale datasets have been mined, the frequent itemsets of the large-scale dataset are inferred directly from them, rather than mined again from the large-scale dataset itself. We conduct extensive experiments on one synthetic dataset and four UCI datasets. The experimental results show that our algorithm is significantly faster and consumes less memory than the leading algorithms.
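The paper's exact up-scaling rule is not reproduced here, but the general idea can be sketched with the classic partition property: when the large-scale dataset is the union of the small-scale ones and the same relative support threshold is used throughout, every globally frequent itemset must be locally frequent in at least one small-scale dataset. The minimal Python sketch below (the names local_frequent_itemsets and up_scale are illustrative, not taken from the paper) reuses the local results and recounts an itemset in a partition only when its count is unknown there.

```python
from itertools import combinations
from collections import defaultdict

def local_frequent_itemsets(transactions, min_sup, max_len=3):
    """Mine the frequent itemsets of one small-scale dataset by plain counting.
    Stand-in for any standard miner such as Apriori or FP-growth."""
    counts = defaultdict(int)
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return {iset: c for iset, c in counts.items() if c >= min_sup}

def up_scale(local_results, partitions, min_sup_total):
    """Infer the frequent itemsets of the union of all partitions from the
    per-partition results, recounting an itemset in a partition only when its
    local count is unknown (i.e. it was locally infrequent there).  Relies on
    the partition property: a globally frequent itemset is locally frequent in
    at least one partition when relative thresholds are consistent."""
    candidates = set().union(*(r.keys() for r in local_results))
    global_counts = defaultdict(int)
    for cand in candidates:
        for result, part in zip(local_results, partitions):
            if cand in result:                 # reuse the already-known count
                global_counts[cand] += result[cand]
            else:                              # recount only where it is missing
                s = set(cand)
                global_counts[cand] += sum(1 for t in part if s <= set(t))
    return {c: n for c, n in global_counts.items() if n >= min_sup_total}

# Toy example: two "small-scale" datasets whose union is the large-scale one.
part1 = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
part2 = [["a", "c"], ["a", "b", "c"], ["c"]]
locals_ = [local_frequent_itemsets(p, min_sup=2) for p in (part1, part2)]
print(up_scale(locals_, [part1, part2], min_sup_total=4))
```

The point of the sketch is only that the large-scale result is assembled from the small-scale results, with the original transactions consulted just to fill in missing counts; the paper's own inference step may differ.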
Highlights
To analyze customers' buying behavior from transaction databases, Agrawal et al. first presented frequent itemset mining in 1993 [1]. It is one of the critical data mining tasks and has been widely used in many other significant data mining tasks, including mining associations and correlations, classification, and clustering.
The contributions of this paper are listed as follows: 1) This paper presents a novel framework for mining frequent itemsets from datasets of different scales.
2) We introduce a method (up-scaling) that computes the frequent itemsets of the large-scale dataset from the frequent itemsets of the small-scale datasets rather than from the original data.
Summary
To analyze customers' buying behavior from transaction databases, Agrawal et al. first presented frequent itemset mining in 1993 [1]. It is one of the critical data mining tasks and has been widely used in many other significant data mining tasks, including mining associations and correlations, classification, and clustering. After Apriori was proposed, several improved algorithms followed, because Apriori needs to scan the database repeatedly. These algorithms share a common feature: they generate candidate itemsets. The FP-growth algorithm is a classic representative that does not generate candidate itemsets; it compresses the database, restricted to its frequent items, into an FP-tree, which retains the itemset association information [2]. To enhance the efficiency of frequent itemset mining, three data structures were presented by Deng et al., named Node-list, N-list, and Nodeset. Despite the advantages of Nodeset, two further data structures (DiffNodeset [3] and NegNodeset [4]) were proposed by Deng et al. and Aryabarzan et al., along with two corresponding algorithms, dFIN and negFIN.
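As a point of reference for the candidate-generation behaviour mentioned above, the following minimal Python sketch shows the textbook Apriori join-and-prune step; apriori_gen and its inputs are illustrative and are not part of the surveyed algorithms.

```python
from itertools import combinations

def apriori_gen(frequent_km1, k):
    """Textbook Apriori candidate generation: join frequent (k-1)-itemsets that
    share their first k-2 items, then prune any candidate that has an
    infrequent (k-1)-subset.  Illustrative only."""
    prev = sorted(frequent_km1)
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:k - 2] == b[:k - 2]:                     # join step
                cand = tuple(sorted(set(a) | set(b)))
                # prune step: every (k-1)-subset must itself be frequent
                if all(sub in frequent_km1
                       for sub in combinations(cand, k - 1)):
                    candidates.append(cand)
    return candidates

# Example: frequent 2-itemsets -> candidate 3-itemsets
f2 = {("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")}
print(apriori_gen(f2, 3))   # [('a', 'b', 'c')]; ('b', 'c', 'd') is pruned
```

FP-growth, dFIN, and negFIN avoid this explicit candidate generation by working on compressed structures (FP-tree, N-list/Nodeset variants) instead.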