Abstract

Large-scale data processing is one of the focal points of research in information technology. Traditional association rule algorithms incur a large overhead because the frequent itemsets must be computed over the whole dataset. The rapid development of distributed technology has made cloud computing a practical platform for implementing data processing algorithms. To improve on the traditional association rule algorithm, this paper proposes AprioriMR, an association rule mining algorithm based on cloud computing. AprioriMR stores its data in HDFS and is well adapted to Hadoop's Map-Reduce computing model. It divides the task into two parts, processes them with Map-Reduce operations, and combines the results to produce the frequent patterns. AprioriMR inherits Map-Reduce's scalability to huge datasets and to thousands of processing nodes. Experimental results show that it is very efficient compared with the traditional association rule algorithm and achieves a good speedup when dealing with massive data.
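The abstract does not give the details of AprioriMR, so the following is only a minimal sketch of how level-wise Apriori counting can be phrased as map and reduce steps. It runs in a single Python process rather than on Hadoop, and every name in it (`map_phase`, `reduce_phase`, `frequent_itemsets`, the two sample splits, the support threshold) is a hypothetical choice made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(split, k, prev_frequent):
    """Emit (candidate k-itemset, 1) for each transaction in one data split.

    Apriori pruning: a k-itemset is emitted only if all of its (k-1)-subsets
    were frequent in the previous pass (no pruning is possible for k == 1).
    """
    for transaction in split:
        for itemset in combinations(sorted(set(transaction)), k):
            if k == 1 or all(
                sub in prev_frequent for sub in combinations(itemset, k - 1)
            ):
                yield itemset, 1


def reduce_phase(pairs, min_support):
    """Sum the counts per itemset and keep those meeting the support threshold."""
    counts = defaultdict(int)
    for itemset, count in pairs:
        counts[itemset] += count
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


def frequent_itemsets(splits, min_support):
    """Run one map/reduce pass per itemset size until no itemset is frequent."""
    result, prev_frequent, k = {}, set(), 1
    while True:
        # On a cluster each split would be handled by a separate mapper;
        # here the splits are simply iterated in turn (assumption).
        pairs = (p for split in splits for p in map_phase(split, k, prev_frequent))
        level = reduce_phase(pairs, min_support)
        if not level:
            return result
        result.update(level)
        prev_frequent = set(level)
        k += 1


# Hypothetical toy data: four transactions divided into two splits.
splits = [
    [["bread", "milk"], ["bread", "beer", "eggs"]],
    [["milk", "beer", "bread"], ["bread", "milk", "beer"]],
]
frequent = frequent_itemsets(splits, min_support=3)
```

Itemsets are represented as sorted tuples so that the same set of items always maps to the same key in the reduce step. In a real Hadoop job the generator in `frequent_itemsets` would be replaced by the framework's shuffle between mappers and reducers.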

