Abstract

This paper applies rough set theory to Inductive Logic Programming (ILP, a relatively new method in machine learning) to deal with imperfect data occurring in large real-world applications. We investigate various kinds of imperfect data in ILP and propose rough problem settings to deal with incomplete background knowledge (where essential predicates/clauses are missing), indiscernible data (where some examples belong to both the positive and negative sets of training examples), missing classification (where some examples are unclassified), and overly strong declarative bias (which causes the search for solutions to fail). The rough problem settings relax the strict requirements of the standard normal problem setting for ILP, so that rough but useful hypotheses can be induced from imperfect data. We give simple measures of learning quality for the rough problem settings. For other kinds of imperfect data (noisy data, too sparse data, missing values, real-valued data, etc.), while referring to their traditional handling techniques, we also point out the possibility of new methods based on rough set theory.
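To make the notion of indiscernible data concrete, the following sketch (not the paper's algorithm, and using hypothetical attribute-tuple examples rather than ILP clauses) shows Pawlak-style lower and upper approximations of the positive class: examples with identical attributes that carry conflicting labels fall in the upper but not the lower approximation, and the ratio of the two sizes gives a simple rough quality measure of the kind the abstract alludes to.

```python
from collections import defaultdict

def rough_approximations(examples, labels):
    """Lower/upper approximations of the positive class under
    attribute indiscernibility (classical rough set theory).
    This is an illustrative sketch, not the paper's method."""
    # Group examples into indiscernibility classes by attribute tuple.
    blocks = defaultdict(list)
    for ex, y in zip(examples, labels):
        blocks[ex].append(y)
    lower, upper = set(), set()
    for ex, ys in blocks.items():
        if any(y == '+' for y in ys):
            upper.add(ex)           # block intersects the positive class
            if all(y == '+' for y in ys):
                lower.add(ex)       # block lies entirely in the positive class
    return lower, upper

# ('red', 'round') appears with both '+' and '-' labels, so it is
# indiscernible: it enters the upper but not the lower approximation.
examples = [('red', 'round'), ('red', 'round'),
            ('blue', 'square'), ('blue', 'round')]
labels = ['+', '-', '+', '-']
lower, upper = rough_approximations(examples, labels)
quality = len(lower) / len(upper)   # a simple rough quality measure
```

In a full ILP setting the indiscernibility relation would be induced by the available background knowledge rather than by raw attribute tuples, but the same lower/upper structure applies.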
