Global and saturated probabilistic approximations based on generalized maximal consistent blocks

Patrick G Clark, Teresa Mroczek, Rafal Niemiec, Jerzy W Grzymala-Busse, Zdzislaw S Hippe

doi:10.1093/jigpal/jzac015

Abstract

In this paper, missing attribute values in incomplete data sets are given one of three interpretations: lost values, attribute-concept values and ‘do not care’ conditions. Data mining is based on two types of probabilistic approximations, global and saturated. We present results of experiments on mining incomplete data sets using six approaches, each combining one of the three interpretations of missing attribute values with one of the two types of probabilistic approximations. We compare the six approaches using the error rate computed by ten-fold cross validation as the criterion of quality. We show that for some data sets the error rate is significantly smaller (5% level of significance) for lost values, for some data sets the smaller error rate is associated with attribute-concept values, and for others with ‘do not care’ conditions. Similarly, for some approaches the error rate is significantly smaller for saturated probabilistic approximations than for global probabilistic approximations, while for other approaches it is the other way around. Thus, for a given incomplete data set, the best approach to data mining should be chosen by trying all six approaches.
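The abstract does not define the three interpretations, so the following is only a minimal sketch of how they are commonly distinguished in the rough-set literature on incomplete data, namely through the blocks of attribute-value pairs. The symbols `?`, `-` and `*`, the function name `attribute_value_blocks` and the toy data are assumptions made for illustration and are not taken from the paper.

```python
# Assumed notation: "?" lost value, "-" attribute-concept value, "*" 'do not care'.
LOST, ATTR_CONCEPT, DO_NOT_CARE = "?", "-", "*"
MISSING = {LOST, ATTR_CONCEPT, DO_NOT_CARE}


def attribute_value_blocks(cases, decisions, attribute):
    """Return {value: set of case indices} for one attribute.

    A case with a specified value v belongs to the block of v.
    A lost value keeps the case out of every block of the attribute.
    A 'do not care' condition puts the case into every block of the attribute.
    An attribute-concept value puts the case into the blocks of all specified
    values that occur for cases with the same decision (the same concept).
    """
    specified = sorted(
        {c[attribute] for c in cases if c[attribute] not in MISSING}
    )
    blocks = {v: set() for v in specified}
    for i, case in enumerate(cases):
        value = case[attribute]
        if value == LOST:
            continue
        if value == DO_NOT_CARE:
            targets = specified
        elif value == ATTR_CONCEPT:
            targets = {
                c[attribute]
                for c, d in zip(cases, decisions)
                if d == decisions[i] and c[attribute] not in MISSING
            }
        else:
            targets = [value]
        for v in targets:
            blocks[v].add(i)
    return blocks


# Hypothetical toy data: one attribute, five cases, two concepts.
cases = [
    {"temp": "high"},    # case 0
    {"temp": "?"},       # case 1, lost value
    {"temp": "*"},       # case 2, 'do not care' condition
    {"temp": "-"},       # case 3, attribute-concept value
    {"temp": "normal"},  # case 4
]
decisions = ["yes", "yes", "no", "no", "no"]

blocks = attribute_value_blocks(cases, decisions, "temp")
```

In this toy example case 1 falls into no block, case 2 falls into both blocks, and case 3 falls only into the block of `normal`, because that is the only specified value among the cases of its concept. The global and saturated probabilistic approximations studied in the paper are built on top of such blocks and are not reproduced here.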


More From: Logic Journal of the IGPL


Similar Papers

Global and Saturated Probabilistic Approximations Based on Generalized Maximal Consistent Blocks
Patrick G Clark ... Zdzislaw S Hippe
01 Jan 2020

A Comparison of global and local probabilistic approximations in mining data with many missing attribute values
Patrick G Clark ... Jerzy W Grzymala-Busse
01 Dec 2013

Complexity of rule sets in mining incomplete data using characteristic sets and generalized maximal consistent blocks
Patrick G Clark ... Teresa Mroczek
Logic Journal of the IGPL | VOL. 29
18 Sep 2020

Mining Incomplete Data Using Global and Saturated Probabilistic Approximations Based on Characteristic Sets and Maximal Consistent Blocks
Patrick G Clark ... Teresa Mroczek
01 Jan 2020
