Frequent Itemset Mining Using Improved Apriori Algorithm with MapReduce

Seema A Tribhuvan, Nitin R Gavai, Bharti P Vasgi

doi:10.1109/iccubea.2017.8463915

Abstract

Data mining tools estimate future trends and behaviors, allowing businesses to make practical, knowledge-driven decisions. Association rule mining is an important data mining technique used in many fields. In most data mining applications, finding frequent itemsets is the crucial problem to be addressed. Several algorithms, such as Apriori, FP-Growth, and FUIT, have offered improved solutions for mining frequent itemsets. Still, features such as automatic parallelization, fine-grained load balancing, and distribution of data across large clusters remain to be achieved. An improved Apriori algorithm built on the MapReduce framework is used to address these issues, making it possible to achieve parallelism and reduce execution time. Here, a number of MapReduce jobs are used to discover the frequent itemsets of big datasets, with the work parallelized across multiple computing nodes. The proposed system works efficiently on multiple nodes, and its execution time is reduced.
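To make the idea of running one MapReduce job per itemset size concrete, the following is a minimal sketch in Python. It simulates the map and reduce phases in a single process and implements plain level-wise Apriori, not the improved algorithm proposed in the paper, whose details are not given in the abstract. All names (`map_phase`, `reduce_phase`, `next_candidates`, `apriori_mapreduce`) and the use of an absolute support count `min_count` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(split, candidates):
    """Mapper: emit (candidate, 1) for each candidate contained in a transaction."""
    for transaction in split:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                yield cand, 1


def reduce_phase(pairs, min_count):
    """Reducer: sum the counts per candidate and keep those meeting the threshold."""
    totals = defaultdict(int)
    for cand, count in pairs:
        totals[cand] += count
    return {cand: n for cand, n in totals.items() if n >= min_count}


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori property)."""
    items = sorted({i for itemset in frequent for i in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    ]


def apriori_mapreduce(splits, min_count):
    """Run one simulated MapReduce job per itemset size until none survive.

    `splits` is a list of data partitions, each a list of transactions; in a
    real cluster each partition would be processed by a separate mapper.
    """
    candidates = list({frozenset([i]) for split in splits for t in split for i in t})
    result, k = {}, 1
    while candidates:
        # Map: every split is scanned independently (sequentially in this sketch).
        pairs = [pair for split in splits for pair in map_phase(split, candidates)]
        # Reduce: aggregate the partial counts and filter by minimum support.
        frequent = reduce_phase(pairs, min_count)
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result
```

Calling `apriori_mapreduce(splits, min_count)` on a list of transaction partitions returns a dictionary that maps each frequent itemset (a `frozenset`) to its support count. On an actual cluster, each iteration of the loop would correspond to one submitted job, with the candidate list distributed to every mapper.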

Full Text