Abstract

This paper presents a new methodology to improve the quality of rule sets. We performed a series of data mining experiments on completely specified data sets. In these experiments we removed some specified attribute values, or, in different words, replaced such specified values by symbols of missing attribute values, and used these data for rule induction while original, complete data sets were used for testing. In our experiments we used the MLEM2 rule induction algorithm of the LERS data mining system, based on rough sets. Our approach to missing attribute values was based on rough set theory as well. Results of our experiments show that for some data sets and some interpretation of missing attribute values, the error rate was smaller than for the original, complete data sets. Thus, rule sets induced from some data sets may be improved by increasing incompleteness of data sets. It appears that by removing some attribute values, the rule induction system, forced to induce rules from remaining information, may induce better rule sets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call