Distance-Based Methods for Association Rule Mining

Vladimír Bartík

doi:10.4018/978-1-60566-010-3.ch107

Abstract

Association rules are one of the most frequently used types of knowledge discovered from databases. The problem of discovering association rules was first introduced by Agrawal, Imielinski & Swami (1993). There, association rules are discovered from transactional databases, i.e. sets of transactions, where each transaction is a set of items. An association rule is an expression of the form A ⇒ B, where A and B are sets of items. A typical application is market basket analysis, in which a transaction is the content of a basket and the items are products. For example, if a rule milk ∧ juice ⇒ coffee is discovered, it is interpreted as: “If the customer buys milk and juice, s/he is likely to buy coffee too.” These rules are called single-dimensional Boolean association rules (Han & Kamber, 2001). The potential usefulness of a rule is expressed by two metrics: support and confidence. Many algorithms have been developed for mining association rules in transactional databases. The best known is the Apriori algorithm (Agrawal & Srikant, 1994), which has many modifications, e.g. (Kotásek & Zendulka, 2000). These algorithms usually consist of two phases: discovery of frequent itemsets and generation of association rules from them. A frequent itemset is a set of items having support greater than a threshold called minimum support. Association rule generation is controlled by another threshold, referred to as minimum confidence.

Association rules discovered from relational databases can have a more general form, and mining them is more complex than mining rules from transactional databases. In relational databases, association rules are ordinarily discovered from the data of one table (which can be the result of joining several other tables). The table can have many columns (attributes) defined on domains of different types. It is useful to distinguish two types of attributes. A categorical attribute (also called nominal) has a finite number of possible values with no ordering among them (e.g. the country of a customer). A quantitative attribute is a numeric attribute whose domain is infinite or very large and which has an implicit ordering among its values (e.g. the age or salary of a customer). An association rule (Age = [20…30]) ∧ (Country = “Czech Rep.”) ⇒ (Salary = [1000$…2000$]) says that if the customer is between 20 and 30 years old and is from the Czech Republic, s/he is likely to earn between 1000$ and 2000$ per month. Such rules, with two or more predicates (items) containing different attributes, are also called multidimensional association rules. If some attributes of the rules are quantitative, the rules are called quantitative association rules (Han & Kamber, 2001).

If a table contains only categorical attributes, it is possible to use modified algorithms for mining association rules in transactional databases. The crucial problem is processing quantitative attributes: their domains are very large, so these algorithms cannot be applied to them directly, and the attributes must first be discretized. This article deals with mining multidimensional association rules from relational databases, with the main focus on distance-based methods. One of them is a novel method developed by the authors.
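
The notions of support, confidence, frequent itemsets and discretization described above can be illustrated with a small sketch. It is not taken from the article: the function names, the toy baskets and the bin parameters are all hypothetical, the frequent-itemset search is a brute-force enumeration rather than Apriori, and the discretization is plain equal-width binning, shown only as a simple baseline and not as one of the distance-based methods the article discusses. Support is taken here as the fraction of transactions containing an itemset, and confidence of A ⇒ B as support(A ∪ B) / support(A), which are the usual definitions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A ∪ B) / support(A); returns 0.0 if A never occurs."""
    a = frozenset(antecedent)
    s_a = support(a, transactions)
    if s_a == 0:
        return 0.0
    return support(a | frozenset(consequent), transactions) / s_a


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets with support >= min_support.

    Assumption: '>=' is used as the threshold test (some texts use '>').
    This checks every combination and is NOT Apriori's pruned search.
    """
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(combo, transactions)
            if s >= min_support:
                found[frozenset(combo)] = s
    return found


def to_items(row, bins):
    """Turn one relational row into a set of (attribute, value) items.

    `bins` maps a quantitative attribute to (low, width); its value is
    replaced by the equal-width interval [start, start + width) it falls in.
    Attributes not listed in `bins` are treated as categorical.
    """
    items = set()
    for attr, value in row.items():
        if attr in bins:
            low, width = bins[attr]
            start = low + ((value - low) // width) * width
            items.add((attr, (start, start + width)))
        else:
            items.add((attr, value))
    return frozenset(items)


# Hypothetical toy baskets for the rule milk ∧ juice ⇒ coffee.
baskets = [
    frozenset({"milk", "juice", "coffee"}),
    frozenset({"milk", "juice"}),
    frozenset({"milk", "coffee"}),
    frozenset({"milk", "juice", "coffee"}),
    frozenset({"bread"}),
]

rule_support = support({"milk", "juice", "coffee"}, baskets)
rule_confidence = confidence({"milk", "juice"}, {"coffee"}, baskets)
frequent = frequent_itemsets(baskets, min_support=0.5)

# Hypothetical customer row, discretized with assumed bin settings.
customer = {"Age": 27, "Country": "Czech Rep.", "Salary": 1500}
customer_items = to_items(customer, {"Age": (20, 10), "Salary": (1000, 1000)})

if __name__ == "__main__":
    print(rule_support, rule_confidence)
    print(sorted(sorted(s) for s in frequent))
    print(sorted(customer_items))
```

Once a row has been turned into items this way, it can be treated like a transaction, which is why discretization lets transactional algorithms be reused on relational tables. The choice of bin boundaries strongly affects which rules are found, which is the motivation for more careful approaches such as the distance-based ones.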

Distance-Based Methods for Association Rule Mining. Vladimír Bartík. 18 Jan 2011.
