Abstract

Cubes and association rules discover frequent patterns in a data set, most of which are not significant. Previous research has therefore introduced search constraints and statistical metrics to discover significant patterns and reduce processing time. We introduce two pattern pair generalizations: cube pairs, which compare cube groups using a parametric statistical test, and rule pairs, which are built from two similar association rules; they generalize cubes and association rules, respectively. We introduce algorithmic optimizations to discover comparable pattern sets. We carefully study why the two techniques agree or disagree on the validity of specific pairs, considering the p-value for statistical tests and the confidence for association rules. In addition, we analyze the probability distribution of target attributes given confidence thresholds. We also introduce a reliability metric based on cross-validation, which enables an objective comparison between the two kinds of patterns. We present an extensive experimental evaluation with real data sets to understand the significance and reliability of pattern pairs. We show that cube pairs generally produce more reliable results than rule pairs.
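As a rough illustration of the two kinds of evidence contrasted above, the following minimal sketch computes the confidence of two similar association rules and a two-sample test statistic for two cube groups on a toy table. Everything in it is an assumption made for illustration: the abstract does not name the parametric test (Welch's t statistic is used here as a stand-in), the attribute names `smoker`, `cost` and `high` are hypothetical, and a rule pair is assumed to mean two rules that differ in a single antecedent value.

```python
from math import sqrt
from statistics import mean, variance


def confidence(rows, antecedent, consequent):
    """Confidence of the rule antecedent => consequent, i.e. P(consequent | antecedent)."""
    matches = [r for r in rows if all(r.get(k) == v for k, v in antecedent.items())]
    if not matches:
        return 0.0
    hits = sum(1 for r in matches if all(r.get(k) == v for k, v in consequent.items()))
    return hits / len(matches)


def welch_t(a, b):
    """Welch's t statistic for two groups of numeric measurements (assumed test)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical toy data: one dimension (smoker), one measure (cost),
# and a binarized target (high) derived from the measure.
costs = {"yes": [8.0, 9.0, 10.0, 5.0], "no": [4.0, 5.0, 6.0, 9.0]}
rows = [
    {"smoker": group, "cost": c, "high": c >= 8.0}
    for group, values in costs.items()
    for c in values
]

# Rule pair: two rules that differ only in the antecedent value.
conf_yes = confidence(rows, {"smoker": "yes"}, {"high": True})
conf_no = confidence(rows, {"smoker": "no"}, {"high": True})

# Cube pair: the same two groups compared on the numeric measure.
t_stat = welch_t(costs["yes"], costs["no"])

print(f"confidence(smoker=yes => high) = {conf_yes:.2f}")
print(f"confidence(smoker=no  => high) = {conf_no:.2f}")
print(f"Welch t for cost, yes vs no    = {t_stat:.3f}")
```

On this toy table the two confidences differ substantially while the t statistic is modest, which hints at how a rule pair and a cube pair can disagree about the same two groups. Turning the statistic into a p-value would additionally require the t distribution, for example from a statistics library.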
