Informed Selection of Training Examples for Knowledge Refinement

Nirmalie Wiratunga, Susan Craw

doi:10.1007/3-540-39967-4_17

Abstract

Knowledge refinement tools rely on a representative set of training examples to identify and repair faults in a knowledge-based system (KBS). In real environments it is often difficult to obtain a large set of examples, since each problem-solving task must be labelled with the expert’s solution. However, it is often somewhat easier to generate unlabelled tasks that cover the expertise of a KBS. This paper investigates ways to select a suitable sample from a set of unlabelled problem-solving tasks, so that only the selected subset needs to be labelled. The unlabelled examples are clustered according to the way they are solved by the KBS, and selection is targeted on these clusters. Experiments in two domains showed that selective sampling reduced the number of training examples used for refinement, and hence the number that had to be labelled. Moreover, this reduction was possible without affecting the accuracy of the final refined KBS. A single example selected randomly from each cluster was effective in one domain, but the other required a more informed selection that takes account of potentially conflicting repairs.
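The simplest strategy mentioned in the abstract, one randomly chosen example per cluster, can be illustrated with a minimal sketch. The abstract does not say how the clusters are formed, so the sketch assumes that two tasks belong to the same cluster when the KBS fires exactly the same set of rules for both. The toy rule base `RULES`, the helper `toy_solve`, and the function names are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical toy KBS: each rule is a named predicate over a task.
RULES = {
    "r1": lambda t: t["temp"] > 30,
    "r2": lambda t: t["humid"] > 0.5,
}

def toy_solve(task):
    """Return the names of the rules that fire for this task (its 'trace')."""
    return [name for name, fires in RULES.items() if fires(task)]

def cluster_by_trace(tasks, solve):
    """Group unlabelled tasks whose traces contain the same set of rules.

    Assumption: identical rule sets stand in for 'solved in the same way'.
    """
    clusters = defaultdict(list)
    for task in tasks:
        clusters[frozenset(solve(task))].append(task)
    return dict(clusters)

def select_one_per_cluster(clusters, rng):
    """Pick one task at random from each cluster; only these need labelling."""
    return [rng.choice(members) for members in clusters.values()]

# Eight unlabelled tasks covering the four combinations of rule firings.
tasks = [{"temp": t, "humid": h} for t in (20, 25, 35, 40) for h in (0.2, 0.8)]
clusters = cluster_by_trace(tasks, toy_solve)
sample = select_one_per_cluster(clusters, random.Random(0))
print(len(tasks), "tasks ->", len(sample), "to label")  # 8 tasks -> 4 to label
```

In this toy setting the expert labels four tasks instead of eight. The more informed selection that the abstract mentions for the second domain, which takes account of potentially conflicting repairs, would replace the random choice inside `select_one_per_cluster` and is not modelled here.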
