Validation-Based Normalization and Selection of Interestingness Measures for Association Rules

Waleed Aljandal, Doina Caragea, William Hsu, Tim Weninger, Vikas Bahirwani

doi:10.1115/1.802823.paper65

Abstract

We investigate the problem of tuning and selecting among interestingness measures for association rules. We first derive a parametric normalization factor for such measures that addresses imbalanced itemset sizes, and show how it can be generalized across many previously derived measures. Next, we develop a validation-based framework for both the normalization and selection tasks, based upon mutual information measures over attributes. We then apply this framework to market basket data and user profile data in weblogs, to automatically choose among or fine-tune alternative measures for generating and ranking rules. Finally, we show how the derived normalization factor can significantly improve the sensitivity of interestingness measures when used for pure association rule mining and also for a classification task. We also consider how this data-driven approach can be used for fusion of association rule sets: either those elicited from subject matter experts, or those found using prior background knowledge.

Introduction

One of the most important aspects of association rule mining is ranking rules by their significance, according to some quantitative measure that expresses their interestingness with respect to a decision support or associative reasoning task. Rules take the form X → Y, where both X and Y are subsets of an observed itemset L = {I1, I2, ..., Ik}. Two well-known measures for association rule interestingness are the support, P(X), and the confidence, P(Y | X). These probabilistic measures have been combined with other statistical formulae to derive compound measures used in discovering the most significant rules.

One limitation of existing binary measures of rule interestingness is that they do not account for the relative size of the itemsets to which each candidate pair of associated subsets (X, Y) belongs. Moreover, some hidden associations involve candidates that appear only in small groups. Giving some attention and weight to these small groups may therefore lead to a different perspective on the relationships in the data.

This kind of data behavior can be seen, for example, in social network data, where each user record consists of features such as interests, communities, and schools attended. In particular, each user has a list of interests, and each interest in turn corresponds to a list of interest holders. Some interests, such as “DNA replication”, have low membership; whether this is because the interests are less popular or more specialized, it often suggests a more significant association between users naming them than between those who have interests such as “Music” or “Games” in
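As a rough illustration of the two basic measures named above, the following Python sketch computes the support P(X) of an itemset and the confidence P(Y | X) of a rule X → Y over a small set of records. The function names and the toy interest lists are hypothetical and chosen only for illustration; this is not the normalization factor or the validation framework described in the paper.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], records: List[FrozenSet[str]]) -> float:
    """Estimate P(itemset): the fraction of records containing every item."""
    items = frozenset(itemset)
    if not records:
        return 0.0
    hits = sum(1 for record in records if items <= record)
    return hits / len(records)


def confidence(x: Iterable[str], y: Iterable[str],
               records: List[FrozenSet[str]]) -> float:
    """Estimate P(Y | X) as support(X and Y together) / support(X)."""
    x_items, y_items = frozenset(x), frozenset(y)
    support_x = support(x_items, records)
    if support_x == 0.0:
        return 0.0  # rule antecedent never occurs; confidence is undefined
    return support(x_items | y_items, records) / support_x


# Hypothetical user interest lists (toy data, not taken from the paper).
records = [
    frozenset({"Music", "Games"}),
    frozenset({"Music", "Games", "Movies"}),
    frozenset({"Music"}),
    frozenset({"DNA replication", "Genomics"}),
    frozenset({"DNA replication", "Genomics", "Music"}),
]

popular_support = support({"Music"}, records)
rare_support = support({"DNA replication"}, records)
popular_conf = confidence({"Music"}, {"Games"}, records)
rare_conf = confidence({"DNA replication"}, {"Genomics"}, records)
```

In this toy data the rare interest has lower support than the popular one, yet the rule built on it has higher confidence, which is the kind of small-group association the passage describes as easy to overlook when measures ignore relative itemset size.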

