Abstract

Association rule mining is one of the most studied and widely applied fields in data mining. However, the discovered models usually consist of a very large set of rules, which diminishes the user's ability to analyze them. As a result, it is difficult to use the discovered model to support the decision-making process. This handicap is aggravated by the presence of redundant rules in the final set. In this work, a new definition of redundancy in association rules is proposed, based on the user's prior knowledge. A post-processing method is developed to eliminate this kind of redundancy, using association rules already known to the user. Our proposal yields more compact models of association rules, which eases their use in the decision-making process. The experiments show reduction levels that exceed 90 percent of all generated rules, with prior knowledge always below ten percent. Our method therefore improves the efficiency of association rule mining and the exploitation of the discovered rules.

Highlights

  • Mining for association rules has been one of the most studied fields in data mining

  • The research community accepts the semantic definition of association rule redundancy given in [22]: “an association rule is redundant if it conveys the same information - or less general information - than the information conveyed by another rule of the same usefulness and the same relevance”

  • Prior knowledge consists of 6 rules for each dataset

Summary

Introduction

Mining for association rules has been one of the most studied fields in data mining. Its main goal is to find unknown relations among items in a database. An association rule is presented as an implication X → Y, where X is the antecedent and Y is the consequent of the rule. Both X and Y are itemsets and usually, but not necessarily, they satisfy the property X ∩ Y = ∅. The association rule mining problem consists of finding all the rules that satisfy user-given thresholds for support and confidence. It is commonly decomposed into two steps: first, find all frequent itemsets, that is, those whose support reaches the support threshold; second, generate all association rules X → (Y − X), where Y is a frequent itemset, X ⊂ Y, and conf(X → Y) is equal to or greater than the confidence threshold value. This paper proposes a new approach to deal with redundancy, taking into account the user's previous knowledge about the studied domain.
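
The support/confidence formulation above can be illustrated with a minimal brute-force sketch. It is not the method proposed in the paper, and it does not perform any redundancy elimination; it only enumerates frequent itemsets and then emits rules X → (Y − X) that reach a confidence threshold. The function names `support` and `mine_rules`, the toy transactions, and the threshold values are assumptions introduced here for illustration. Exhaustive enumeration is only practical for very small item sets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_supp, min_conf):
    """Return (X, Y - X, support, confidence) tuples (hypothetical helper)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    # Step 1: find all frequent itemsets by exhaustive enumeration.
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_supp:
                frequent[frozenset(candidate)] = s

    # Step 2: for each frequent itemset Y and each non-empty proper
    # subset X of Y, keep X -> (Y - X) if its confidence is high enough.
    # Every subset of a frequent itemset is frequent, so frequent[X] exists.
    rules = []
    for y, supp_y in frequent.items():
        for size in range(1, len(y)):
            for x in combinations(sorted(y), size):
                x = frozenset(x)
                conf = supp_y / frequent[x]  # supp(Y) / supp(X)
                if conf >= min_conf:
                    rules.append((x, y - x, supp_y, conf))
    return rules


# Toy data, invented for this example.
toy = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

for x, y, s, c in mine_rules(toy, min_supp=0.5, min_conf=0.6):
    print(f"{sorted(x)} -> {sorted(y)}  supp={s:.2f}  conf={c:.2f}")
```

Even on four transactions the output already contains several rules, which hints at why the number of generated rules grows quickly on real datasets and why pruning them after mining is attractive.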

Related work
Rule redundancy reduction
Post-processing
Knowledge based redundancy
Methodology
Complexity analysis
Results and discussion
Knowledge vs knowledge based reduction
Conclusion