Abstract

Pattern discovery is an important part of knowledge discovery in databases and falls under data mining. Association rule mining is one of the most popular and revealing data mining techniques for discovering useful patterns, and it plays a key role in decision making by uncovering useful relations between attributes in a database. To mine association rules, frequent itemsets must first be computed, followed by candidate itemsets. Frequent 1-itemsets can be generated easily, but generating frequent 2-itemsets suffers from high time and space complexity. This overhead in the generation of frequent 2-itemsets is the issue addressed in this paper. For higher I/O throughput, frequent itemsets must be generated as quickly and as space-efficiently as possible. To make this possible, the intermediate data produced by pairing each item with every other item in an itemset requires random read/write access. Apache HBase provides such low-latency random access. The experimental results show that when the dataset is stored through HBase on HDFS, the Apriori MapReduce algorithm achieves better time and space complexity for finding association rules.
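
The pairing step described above can be illustrated with a minimal single-machine sketch. It is not the paper's HBase or MapReduce implementation: the function name `frequent_pairs`, the `min_support` parameter, and the sample transactions are assumptions made for illustration, and an in-memory `Counter` stands in for the randomly accessed store of intermediate pair counts.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return (frequent 1-itemsets, frequent 2-itemsets) with support counts."""
    # Pass 1: count each item once per transaction and keep the frequent ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i: c for i, c in item_counts.items() if c >= min_support}

    # Pass 2: pair each frequent item with every other frequent item in the
    # same transaction. Each increment is a random read/write on the pair
    # counts; the Counter is a hypothetical in-memory stand-in for that store.
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(i for i in set(t) if i in frequent_items)
        pair_counts.update(combinations(kept, 2))

    frequent_2 = {p: c for p, c in pair_counts.items() if c >= min_support}
    return frequent_items, frequent_2


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
items, pairs = frequent_pairs(transactions, min_support=3)
```

Even in this small example the number of pair counters grows roughly with the square of the number of frequent items, which is why the intermediate data for 2-itemsets dominates both time and space.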
