Abstract

Mining incomplete data using approximations based on characteristic sets is a well-established technique. It is applicable to incomplete data sets with a few interpretations of missing attribute values, e.g., lost values and “do not care” conditions. On the other hand, maximal consistent blocks were introduced for incomplete data sets with only “do not care” conditions, using only lower and upper approximations. In this paper we introduce an extension of the maximal consistent blocks to incomplete data sets with any interpretation of missing attribute values and with probabilistic approximations. We prove new results on probabilistic approximations based on generalized maximal consistent blocks. Additionally, we present results of experiments on mining incomplete data using both characteristic sets and maximal consistent blocks and using two interpretations of missing attribute values: lost values and “do not care” conditions. We show that there is some evidence that the best approach is using middle probabilistic approximations based on characteristic sets or on maximal consistent blocks.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.