Abstract

Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. A data set is called incomplete if some of its attribute values are unknown, and learning from incomplete data sets is usually more difficult than learning from complete ones. Rough-set theory has been widely used for data classification problems. Most conventional mining algorithms based on rough-set theory identify relationships among data using crisp attribute values. Data with quantitative values, however, are common in real-world applications. In this paper, we therefore address the problem of learning from incomplete quantitative data sets based on rough sets. We propose a learning algorithm that simultaneously derives certain and possible fuzzy rules from incomplete quantitative data sets and estimates the missing values during the learning process. Quantitative values are first transformed into fuzzy sets of linguistic terms using membership functions. Unknown attribute values are then assumed to be any possible linguistic term and are gradually refined according to the fuzzy incomplete lower and upper approximations derived from the given quantitative training examples. The examples and the approximations then interact with each other to derive certain and possible rules and to estimate appropriate values for the unknown attributes. The derived rules can then serve as knowledge about the incomplete quantitative data set.
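The first two steps described above (fuzzifying quantitative values and treating an unknown value as any possible linguistic term) can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the triangular membership functions, the term names `low`, `middle`, and `high`, the 0 to 100 value range, and the helper names `triangular` and `fuzzify` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from typing import Dict, Optional

# Hypothetical triangular membership functions (left foot, peak, right foot)
# for a single attribute on an assumed 0..100 scale.
TERMS = {
    "low": (0.0, 0.0, 50.0),
    "middle": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 100.0),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in a triangle with feet a, c and peak b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(value: Optional[float]) -> Dict[str, Optional[float]]:
    """Map a quantitative value to linguistic terms with membership degrees.

    A missing value (None) keeps every term as a candidate with an
    undetermined degree, to be refined by later steps (not shown here).
    """
    if value is None:
        return {term: None for term in TERMS}
    degrees = {term: triangular(value, *abc) for term, abc in TERMS.items()}
    # Keep only the terms the value actually belongs to.
    return {term: d for term, d in degrees.items() if d > 0.0}


if __name__ == "__main__":
    print(fuzzify(25.0))
    print(fuzzify(None))
```

In this sketch a known value is mapped only to the terms in which it has nonzero membership, while an unknown value retains all terms as candidates. The refinement of those candidates through the fuzzy incomplete lower and upper approximations is the core of the proposed method and is deliberately left out here.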
