Abstract

Association rule mining is an important research topic in data mining. It consists of two steps: finding frequent itemsets and then extracting interesting rules from those itemsets. In the first step, efficiency matters because discovering frequent itemsets is computationally expensive. In the second step, unbiased assessment matters for good decision making. In this paper, we address both the efficiency of the mining algorithm and the measure of interest applied to the resulting rules. First, we present an algorithm for finding frequent itemsets that uses a vertical database, in which each item is stored with the set of identifiers (tidset) of the transactions that contain it. We also introduce a modified vertical data format to reduce the size of the database and an itemset reordering strategy to reduce the size of the intermediate tidsets. Second, we present a new measure to evaluate the interest of the resulting association rules. Our performance analysis shows that the proposed algorithm reduces the size of the intermediate tidsets generated during the mining process, and the smaller tidsets make intersection operations faster. Using our interest-measuring test helps to avoid the discovery of misleading rules.
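
To make the vertical-format idea concrete, the following is a minimal Python sketch of generic tidset-intersection mining in the style of Eclat. It is not the algorithm proposed in the paper: the modified vertical format and the paper's reordering strategy are not specified in this abstract, so the sketch swaps in a plain item-to-tidset mapping and an ascending-support ordering as an assumed stand-in. The names `to_vertical` and `mine_frequent` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Convert a horizontal database into item -> tidset (set of transaction ids)."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def mine_frequent(transactions, min_support):
    """Return {frozenset(itemset): support count} via tidset intersections.

    min_support is an absolute count of transactions.
    """
    vertical = to_vertical(transactions)
    # Keep only frequent single items. Sorting by ascending support is an
    # assumed heuristic here, not the reordering strategy from the paper.
    items = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_support),
        key=lambda pair: (len(pair[1]), pair[0]),
    )
    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            # The support of an itemset is the size of its tidset.
            frequent[frozenset(itemset)] = len(tids)
            # Intersect with every later candidate to grow the itemset by one item.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    extend(frozenset(), items)
    return frequent


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in mine_frequent(db, min_support=3).items():
        print(sorted(itemset), support)
```

Each recursive step intersects only tidsets that have already been narrowed by the current prefix, so the cost of a step depends on how large the intermediate tidsets are. This is why a smaller tidset makes the intersection cheaper.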
