Sequence-Growth: A Scalable and Effective Frequent Itemset Mining Algorithm for Big Data Based on MapReduce Framework

Yen-Hui Liang, Shiow-Yang Wu

doi:10.1109/bigdatacongress.2015.65

Abstract

Frequent itemset mining (FIM) is an important research topic because it is widely applied in the real world to find frequent itemsets and to mine human behavior patterns. The FIM process is both memory- and compute-intensive. As data grows exponentially every day, the problems of efficiency and scalability become more severe. In this paper, we propose a new distributed FIM algorithm, called Sequence-Growth, and implement it on the MapReduce framework. Our algorithm applies the idea of lexicographical order to construct a tree, called the lexicographical sequence tree, that allows us to find all frequent itemsets without an exhaustive search over the transaction databases. In addition, the breadth-wide support-based pruning strategy is an important factor contributing to the efficiency and scalability of our algorithm. To test the performance of our algorithm, we conduct experiments on various aspects of it on the MapReduce framework with large datasets. The results show the good efficiency and scalability of Sequence-Growth, especially when dealing with big data and long itemsets. Our algorithm also proposes a new mining methodology that can be easily modified for sequential pattern mining, trajectory pattern mining, and other association rule mining algorithms. We believe that it will make a valuable contribution to the future development of association rule mining algorithms for big data.
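
The abstract gives no pseudocode, so the following is only a single-machine sketch of the general idea it describes: itemsets are treated as lexicographically ordered sequences, grown one level at a time, and pruned by support after each level. It is not the authors' implementation. The function names (`extensions`, `mine_frequent_itemsets`), the absolute `min_support` count, and the use of a plain loop with `Counter` in place of real MapReduce map and reduce phases are all assumptions made for illustration.

```python
from collections import Counter


def extensions(transaction, prefixes):
    """Yield every surviving prefix found in the transaction, grown by one later item."""
    items = set(transaction)
    for prefix in prefixes:
        if items.issuperset(prefix):
            for item in transaction:
                # Extend only with items that sort after the prefix's last item,
                # so each itemset is produced from exactly one parent sequence.
                if item > prefix[-1]:
                    yield prefix + (item,)


def mine_frequent_itemsets(transactions, min_support):
    """Return {sorted item tuple: support count} for all frequent itemsets."""
    # Canonical form: each transaction as a sorted tuple of distinct items.
    ordered = [tuple(sorted(set(t))) for t in transactions]

    # Level 1: count single items, then prune by support.
    counts = Counter((item,) for t in ordered for item in t)
    level = {seq: n for seq, n in counts.items() if n >= min_support}

    frequent = {}
    while level:
        frequent.update(level)
        counts = Counter()
        for t in ordered:
            # Stand-in for the "map" phase: emit candidate sequences per transaction.
            counts.update(extensions(t, level))
        # Stand-in for the "reduce" phase: keep only sequences meeting min_support.
        level = {seq: n for seq, n in counts.items() if n >= min_support}
    return frequent
```

Because a candidate is only ever grown from a prefix that survived the previous level, infrequent branches are never expanded. For instance, with the five transactions `abc`, `ab`, `ac`, `bc`, `abc` and `min_support=3`, all single items and pairs are kept, while `abc` (support 2) is pruned.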

Similar Papers

A false negative approach to mining frequent itemsets from high speed transactional data streams
Jeffrey Xu Yu ... Aoying Zhou
Information Sciences | VOL. 176
29 Nov 2005

A Novel Nodesets-Based Frequent Itemset Mining Algorithm for Big Data using MapReduce
Borra Sivaiah ... Ramisetty Rajeswara Rao
International Journal of Electrical and Computer Engineering Systems | VOL. 14
14 Nov 2023

Mining Frequent Item and Item Sets Using Fuzzy Slices
Ms Poonam A Manjare ... Mrs R.R Shelke
International Journal of Engineering Trends and Technology
25 Mar 2014

Mining Frequent Itemsets in Large Data Warehouses: A Novel Approach Proposed for Sparse Data Sets
S M Fakhrahmad ... M H Sadreddini
16 Dec 2007
