Abstract

Association is a data mining technique used to identify relationships between itemsets in a database (association rules). Research on association rules since the introduction of the AIS algorithm in 1993 has yielded several new algorithms. Some of these were evaluated on artificial (IBM) datasets and were claimed by their authors to perform reliably in finding maximal frequent itemsets. However, these datasets have different characteristics from real-world datasets. The goal of this research is to compare the performance of the Apriori and Cut Both Ways (CBW) algorithms using 3 real-world datasets. We used small and large minimum support thresholds as treatments for each algorithm and dataset. We find that the characteristics of the datasets have a significant effect on the performance of Apriori and CBW. Regarding the support counting strategy, horizontal counting performed better than vertical intersection, even though fewer candidate frequent itemsets were counted.
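To make the two support counting strategies concrete, the following is a minimal sketch, not the implementation evaluated in this study. The toy transactions and the function names (support_horizontal, build_tidsets, support_vertical) are hypothetical and chosen for illustration only. Horizontal counting scans every transaction and checks whether it contains the candidate; vertical intersection first stores, for each item, the set of transaction IDs in which it occurs, then intersects those sets.

```python
# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]


def support_horizontal(candidate, transactions):
    """Horizontal counting: scan all transactions, count those containing the candidate."""
    return sum(1 for t in transactions if candidate <= t)


def build_tidsets(transactions):
    """Vertical layout: map each item to the set of transaction IDs containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def support_vertical(candidate, tidsets):
    """Vertical intersection: intersect the tidsets of the candidate's items.

    Assumes a non-empty candidate itemset.
    """
    ids = [tidsets.get(item, set()) for item in candidate]
    return len(set.intersection(*ids))


candidate = frozenset({"bread", "milk"})
tidsets = build_tidsets(transactions)
print(support_horizontal(candidate, transactions))
print(support_vertical(candidate, tidsets))
```

Both functions return the same support count for a given candidate; they differ only in how the database is laid out and traversed, which is where dataset characteristics can influence running time.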
