A Novel Nodesets-Based Frequent Itemset Mining Algorithm for Big Data using MapReduce

Borra Sivaiah, Ramisetty Rajeswara Rao

doi:10.32985/ijeces.14.9.9

Abstract

Data in organizations is growing rapidly from many sources, and data that traditional tools and techniques cannot handle in a scalable fashion is known as big data. Similarly, many existing frequent itemset mining algorithms perform well but scale poorly because they cannot exploit the parallel processing power available locally or in cloud infrastructure. Since the big data and cloud ecosystem overcomes limitations in computing resources, distributed programming paradigms such as MapReduce are a natural choice. In this paper, we propose a novel algorithm, Nodesets-based Fast and Scalable Frequent Itemset Mining (FSFIM), to extract frequent itemsets from big data. A Pre-Order Coding (POC) tree represents the data and speeds up processing, and the Nodeset is the underlying data structure used to discover frequent itemsets efficiently. Compared with its predecessors, Node-lists and N-lists, a Nodeset saves half of the memory because it needs only the pre-order or the post-order code of each node rather than both. Cloudera's Distribution of Hadoop (CDH), a MapReduce framework, is used for the empirical study, and a prototype application is built to evaluate the performance of FSFIM. Experimental results show that FSFIM outperforms existing algorithms such as Mahout PFP, Mlib PFP, and Big FIM in both speed and scalability. The authors conclude that FSFIM is an ideal candidate for real-time applications that mine frequent itemsets from big data.
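The abstract does not give FSFIM's internals, so the following is only a minimal single-machine sketch of the two data structures it names, under stated assumptions: a POC tree is taken to be a prefix tree over transactions whose items are sorted by descending frequency, each node receives a pre-order code, and the Nodeset of a single item is taken to be the list of (pre-order code, count) pairs of the nodes holding that item. The names `Node`, `build_poc_tree`, `item_nodesets`, and `support` are hypothetical, and the MapReduce distribution of the work is not shown.

```python
from collections import Counter


class Node:
    """One POC-tree node: an item, its count, and its pre-order code."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}  # item -> Node, kept in insertion order
        self.pre = None     # pre-order code, assigned after the tree is built


def build_poc_tree(transactions, min_support):
    """Build a prefix tree over the frequent items of each transaction."""
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in freq.items() if c >= min_support}
    root = Node(None)
    for t in transactions:
        node = root
        # Assumed ordering: descending frequency, ties broken by item name.
        for item in sorted(set(t) & frequent, key=lambda i: (-freq[i], i)):
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
    return root


def item_nodesets(root):
    """Assign pre-order codes and collect (pre, count) pairs per item."""
    nodesets = {}
    code = 0
    stack = [root]
    while stack:
        node = stack.pop()
        node.pre = code
        code += 1
        if node.item is not None:
            nodesets.setdefault(node.item, []).append((node.pre, node.count))
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(list(node.children.values())))
    return nodesets


def support(nodeset):
    """Support of an item is the sum of the counts in its Nodeset."""
    return sum(count for _, count in nodeset)


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["d"]]
    sets = item_nodesets(build_poc_tree(data, min_support=2))
    for item, nodeset in sorted(sets.items()):
        print(item, nodeset, support(nodeset))
```

Each Nodeset entry here carries a single code per node, which is the property the abstract credits for the memory saving over structures that store both pre-order and post-order codes.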

Journal: International journal of electrical and computer engineering systems
Publication Date: Nov 14, 2023
License type: cc-by-nc-nd

Similar Papers

Sequence-Growth: A Scalable and Effective Frequent Itemset Mining Algorithm for Big Data Based on MapReduce Framework
Yen-Hui Liang ... Shiow-Yang Wu
01 Jun 2015

Mining Frequent Item and Item Sets Using Fuzzy Slices
Ms Poonam A Manjare ... Mrs R.R Shelke
International Journal of Engineering Trends and Technology
25 Mar 2014

Compressed Bitmaps Based Frequent Itemsets Mining on Hadoop
Aref A Saeed ... Saeed Mahfooz
09 May 2016

A false negative approach to mining frequent itemsets from high speed transactional data streams
Jeffrey Xu Yu ... Aoying Zhou
Information Sciences | VOL. 176
29 Nov 2005
