Efficient Redundancy Reduced Subgroup Discovery via Quadratic Programming

Rui Li, Stefan Kramer

doi:10.1007/978-3-642-33492-4_12

Abstract

Subgroup discovery is a task at the intersection of predictive and descriptive induction, aiming at identifying subgroups that have the most unusual statistical (distributional) characteristics with respect to a property of interest. Although a great deal of work has been devoted to the topic, one remaining problem concerns the redundancy of subgroup descriptions, which often convey very similar information. In this paper, we propose a quadratic programming based approach to reduce the amount of redundancy in the subgroup rules. Experimental results on 12 datasets show that the resulting subgroups are in fact less redundant than those produced by standard methods. In addition, our experiments show that the computational cost is significantly lower than that of the other methods compared in the paper.

Keywords

subgroup discovery, mutual information, quadratic programming, rule learning, redundancy
