Sampling frequent and minimal boolean patterns: theory and application in classification

Geng Li, Mohammed J Zaki

doi:10.1007/s10618-015-0409-y

Journal: Data Mining and Knowledge Discovery
Publication Date: Mar 12, 2015
Citations: 7

Affiliation: Rensselaer Polytechnic Institute, Qatar Airways (Qatar)

#Disjunctive Normal Form #Boolean Patterns

Abstract

We tackle the challenging problem of mining the simplest Boolean patterns from categorical datasets. Instead of complete enumeration, which is typically infeasible for this class of patterns, we develop effective sampling methods to extract a representative subset of the minimal Boolean patterns in disjunctive normal form (DNF). We propose a novel theoretical characterization of the minimal DNF expressions, which allows us to prune the pattern search space effectively. Our approach can provide a near-uniform sample of the minimal DNF patterns. We perform an extensive set of experiments to demonstrate the effectiveness of our sampling method. We also show that minimal DNF patterns make effective features for classification.
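To make the notion of a frequent DNF pattern concrete, here is a minimal Python sketch of how the support of a DNF expression could be computed over a small transactional dataset. The dataset, the names `dnf_tidset`, `dnf_support`, and `is_frequent`, and the representation of a DNF expression as a list of item sets (one per conjunctive clause) are illustrative assumptions, not the authors' implementation. The sketch covers only support counting; it does not implement the paper's minimality characterization or its sampling procedure.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy dataset: transaction id -> items present in that transaction.
TRANSACTIONS: Dict[int, Set[str]] = {
    1: {"a", "b"},
    2: {"a", "c"},
    3: {"b", "c"},
    4: {"c"},
}

# Assumed representation: a DNF expression is an OR over clauses,
# and each clause is an AND over the items it contains.
Clause = FrozenSet[str]
DNF = List[Clause]


def dnf_tidset(dnf: DNF, transactions: Dict[int, Set[str]]) -> Set[int]:
    """Return the ids of transactions that satisfy at least one clause."""
    return {
        tid
        for tid, items in transactions.items()
        if any(clause <= items for clause in dnf)
    }


def dnf_support(dnf: DNF, transactions: Dict[int, Set[str]]) -> int:
    """Support is the number of transactions satisfying the expression."""
    return len(dnf_tidset(dnf, transactions))


def is_frequent(dnf: DNF, transactions: Dict[int, Set[str]], min_support: int) -> bool:
    """A pattern is frequent if its support reaches the chosen threshold."""
    return dnf_support(dnf, transactions) >= min_support


# (a AND b) OR c
pattern: DNF = [frozenset({"a", "b"}), frozenset({"c"})]
print(sorted(dnf_tidset(pattern, TRANSACTIONS)))
print(is_frequent(pattern, TRANSACTIONS, min_support=3))
```

Under this representation, a transaction satisfies the expression if any clause is a subset of its items, so the expression's transaction set is the union of the per-clause transaction sets.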

Full Text