Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

José L. Balcázar

doi:10.2168/lmcs-6(2:4)2010

Abstract

Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss correspond actually to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. We explore finally an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises.
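The two parameters named in the abstract, support and confidence, can be illustrated with a minimal sketch. The function names `support` and `confidence` and the toy dataset below are assumptions made for illustration; they are not taken from the paper. Each transaction is modeled as a set of items, and confidence is computed as the empirical conditional probability of the consequent given the antecedent.

```python
from fractions import Fraction

def support(dataset, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in dataset if itemset <= t)
    return Fraction(hits, len(dataset))

def confidence(dataset, antecedent, consequent):
    """Empirical conditional probability of the consequent given the antecedent.

    Returns None when no transaction contains the antecedent, since the
    conditional probability is then undefined.
    """
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    covered = [t for t in dataset if a <= t]
    if not covered:
        return None
    return Fraction(sum(1 for t in covered if both <= t), len(covered))

# Hypothetical toy dataset: five transactions over the items a, b, c, d.
toy = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
       frozenset("b"), frozenset("abcd")]

print(support(toy, "ab"))          # share of transactions containing a and b
print(confidence(toy, "a", "b"))   # partial rule a -> b
print(confidence(toy, "d", "a"))   # full-confidence implication d -> a
```

Exact fractions are used instead of floats so that a full-confidence implication compares equal to 1 without rounding concerns.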

Highlights

The relatively recent discipline of Data Mining involves a wide spectrum of techniques, inherited from different origins such as Statistics, Databases, or Machine Learning.
We show that the main alternatives we discuss correspond to just two variants, which differ in the treatment of full-confidence implications.
A number of formalizations of the intuition of redundancy among association rules exist in the literature, often with proposals for defining irredundant bases. All of these are weaker than the notion that we would consider natural by comparison with implications.

Summary

Introduction

The relatively recent discipline of Data Mining involves a wide spectrum of techniques, inherited from different origins such as Statistics, Databases, or Machine Learning. A number of formalizations of the intuition of redundancy among association rules exist in the literature, often with proposals for defining irredundant bases (see [1], [13], [27], [33], [36], [38], [44], the survey [29], and section 6 of the survey [12]). All of these are weaker than the notion that we would consider natural by comparison with implications (of which we start the study in the last section of this paper). To reduce the size of a basis further without losing information, more powerful notions of redundancy must be deployed. We consider for this role the proposal of handling full-confidence implications separately, to a given extent, from rules of confidence lower than 1, in order to profit from their very different combinatorics.
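As an illustration of what redundancy between two rules can look like, the sketch below checks one simple sufficient condition: a rule X -> Y makes X' -> Y' redundant whenever X is a subset of X' and X' ∪ Y' is a subset of X ∪ Y. Under that condition the second rule has confidence and support at least as high as the first in every dataset. The helper names `conf` and `makes_redundant` and the toy data are assumptions made for this example. The condition is presented only as an illustration and is not claimed to coincide with the notions of redundancy characterized in the paper.

```python
from fractions import Fraction

def conf(dataset, x, y):
    """Confidence of the rule x -> y, or None if x never occurs."""
    x = frozenset(x)
    xy = x | frozenset(y)
    n_x = sum(1 for t in dataset if x <= t)
    n_xy = sum(1 for t in dataset if xy <= t)
    return Fraction(n_xy, n_x) if n_x else None

def makes_redundant(rule, other):
    """Sufficient syntactic check (an illustrative assumption).

    `rule` is (X, Y) and `other` is (X2, Y2). Returns True when
    X is a subset of X2 and X2 | Y2 is a subset of X | Y.
    """
    x, y, x2, y2 = map(frozenset, (*rule, *other))
    return x <= x2 and (x2 | y2) <= (x | y)

# Hypothetical toy dataset, used only to observe the effect numerically.
toy = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
       frozenset("b"), frozenset("abcd")]

strong = ("a", "bc")   # a -> bc
weak = ("ab", "c")     # ab -> c

print(makes_redundant(strong, weak))         # syntactic check
print(conf(toy, *strong), conf(toy, *weak))  # confidences on the toy data
```

The check is purely syntactic: it inspects the itemsets of the two rules and never looks at the data, which is what allows it to hold for any dataset.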

Preliminaries

Redundancy Notions

Closure-Based Redundancy

Towards General Entailment

Findings

Discussion

Journal: Logical Methods in Computer Science
Publication Date: Jun 27, 2010
Citations: 32
License type: CC BY 3.0
