Improved algorithm for parallel mining collaborative frequent itemsets in multiple data streams

Fang’Ai Liu, Qianqian Wang, Xin Wang

doi:10.1007/s10586-018-1859-y

Abstract

With the rapid development of World Wide Web technology, complex and diverse data are growing explosively, so frequent itemset mining plays an essential role. Because the limited computing power of a single processor constrains the mining of frequent itemsets in multiple data streams, an improved algorithm for Parallel Mining Collaborative frequent itemsets in multiple data streams (PMCMD-Stream) is proposed. First, the algorithm compresses the potential and frequent itemsets into a CP-Tree in a single scan and updates the tree incrementally by inserting or deleting the related branches, so it does not need to scan the databases repeatedly to generate many candidate frequent itemsets, which saves running time. Second, the algorithm is parallelized to run in the MapReduce programming environment: because each candidate frequent itemset is independent, different candidate frequent itemsets can be processed concurrently by multiple computing machines. Finally, the valuable frequent itemsets, namely the global collaborative frequent itemsets, are obtained. The experimental results show that the PMCMD-Stream algorithm not only improves mining efficiency but also scales much better than the existing algorithms, making it suitable for discovering collaborative frequent itemsets in large-scale data streams.
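
The abstract notes that candidate itemsets are independent and can therefore be counted in a map/reduce fashion. The following is a minimal sketch of that idea only, not the PMCMD-Stream algorithm itself: it uses no CP-Tree and runs in a single process. The function names, the `min_streams` parameter, and the working definition of "collaborative" (an itemset that meets the minimum support in at least `min_streams` streams) are assumptions made for illustration, since the abstract does not define them.

```python
from collections import Counter
from itertools import combinations


def map_phase(stream_id, transactions, max_size=2):
    """Emit ((itemset, stream_id), 1) for every itemset of up to max_size items."""
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                yield (itemset, stream_id), 1


def reduce_phase(pairs):
    """Sum the emitted counts per (itemset, stream_id) key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def collaborative_frequent_itemsets(streams, min_support, min_streams, max_size=2):
    """Return {itemset: set of stream ids} for itemsets frequent in enough streams.

    Assumption: "collaborative" means the relative support of the itemset is
    at least min_support in at least min_streams of the given streams.
    """
    pairs = (
        pair
        for stream_id, transactions in streams.items()
        for pair in map_phase(stream_id, transactions, max_size)
    )
    counts = reduce_phase(pairs)

    frequent_in = {}
    for (itemset, stream_id), count in counts.items():
        if count / len(streams[stream_id]) >= min_support:
            frequent_in.setdefault(itemset, set()).add(stream_id)

    return {
        itemset: stream_ids
        for itemset, stream_ids in frequent_in.items()
        if len(stream_ids) >= min_streams
    }


# Hypothetical toy data: two streams, four transactions each.
example_streams = {
    "s1": [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]],
    "s2": [["a", "b"], ["a", "b"], ["c"], ["a"]],
}
result = collaborative_frequent_itemsets(example_streams, min_support=0.5, min_streams=2)
print(sorted(result))  # [('a',), ('a', 'b'), ('b',)]
```

Because each `(itemset, stream_id)` key is reduced independently of every other key, the keys could in principle be partitioned across workers, which is the property the abstract relies on for parallel execution.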

Journal: Cluster Computing
Publication Date: Jan 30, 2018
Citations: 6

Similar Papers

Research on Topology Association Rules Algorithm Based on Spatial Constraints
Xiao Bin Tang ... Yi Zhi Zhang
Advanced Materials Research | VOL. 998-999
01 Jul 2014

A frequent itemset reduction algorithm for global pattern mining on distributed data streams
Shalini ... Sanjay Kumar Jain
01 Aug 2017

Correlating synchronous and asynchronous data streams
Sudipto Guha ... D Gunopulos
24 Aug 2003

CL-MAX: a clustering-based approximation algorithm for mining maximal frequent itemsets
Seyed Mohsen Fatemi ... Ali Kamandi
International Journal of Machine Learning and Cybernetics | VOL. 12
10 Aug 2020
