Abstract

Because of the size and complexity of the data, mining big data on a single computer is difficult. As databases grow rapidly, parallel and distributed computing systems become increasingly beneficial for data mining applications. Parallelizing Association Rule Mining (ARM) algorithms is therefore an important task for efficiently mining frequent itemsets from large databases. Existing parallel mining algorithms either partition the database horizontally or increase the number of processors to reduce the overall time needed to mine the frequent itemsets. In this paper, a combined Horizontal Parallel-Apriori (HP-Apriori) and Adaptive Frequent Pattern (FP) Growth algorithm is proposed. It divides the database both horizontally and vertically into four sub-processes so that all four tasks can be processed in parallel. The Horizontal Parallel-Apriori algorithm speeds up the mining process by using an index file. An Adaptive Binomial Distribution (ABD) is applied to the Frequent Pattern Growth algorithm to determine the minimum support for mining the optimal frequent itemsets. The experimental analysis indicates that the combined algorithm reduces overall execution time and increases computational speed while remaining highly scalable.
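As a rough illustration of the horizontal partitioning idea only, the sketch below splits a small transaction database by rows, counts candidate itemsets in each partition, and merges the local counts before applying a minimum support threshold. It is not the HP-Apriori algorithm described above: it uses no index file, does no vertical split, and takes `min_support` as a fixed input rather than deriving it from an Adaptive Binomial Distribution. The function names, the toy database `db`, and the round-robin split are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from concurrent.futures import ThreadPoolExecutor


def count_partition(transactions, k):
    """Count every k-itemset that occurs in one horizontal partition."""
    counts = Counter()
    for t in transactions:
        # Sort so that each itemset has a single canonical tuple form.
        for itemset in combinations(sorted(set(t)), k):
            counts[itemset] += 1
    return counts


def split_horizontally(transactions, n_parts):
    """Assign whole transactions (rows) to n_parts partitions, round-robin."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def frequent_itemsets(transactions, k, min_support, n_parts=4):
    """Merge per-partition counts and keep itemsets meeting min_support."""
    parts = split_horizontally(transactions, n_parts)
    total = Counter()
    # Threads keep the sketch simple; a process pool would be the usual
    # choice for CPU-bound counting in Python.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        for local in pool.map(count_partition, parts, [k] * n_parts):
            total.update(local)  # Counter.update adds counts together
    return {s: c for s, c in total.items() if c >= min_support}


# Hypothetical toy database: one list of items per transaction.
db = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]
result = frequent_itemsets(db, k=2, min_support=3)
```

Because each partition holds complete transactions, the local counts can simply be summed, which is what makes the horizontal split easy to parallelize. The four partitions here mirror the four sub-processes mentioned above only in number.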
