Redundancy detection in semistructured case bases

K Racine,Qiang Yang Qiang Yang

doi:10.1109/69.929905

Abstract

With the dramatic proliferation of case-based reasoning systems in commercial applications, many case bases are now becoming legacy systems. They represent a significant portion of an organization's assets, but they are large and difficult to maintain. One of the contributing factors is that these case bases are often large and yet unstructured or semistructured; they are represented in natural language text. Adding to the complexity is the fact that the case bases are often authored and updated by different people from a variety of knowledge sources, making it highly likely for a case base to contain redundant and inconsistent knowledge. We present methods and a system for maintaining large and semistructured case bases. We focus on a difficult problem in case base maintenance: redundancy detection. This problem is particularly pervasive when one deals with a semistructured case base. We discuss an information retrieval-based algorithm and an implemented system for solving this problem. As the ability to contain the knowledge acquisition problem is of paramount importance, our method allows one to express relevant domain expertise for detecting redundancy naturally and effortlessly. Empirical evaluations of the system demonstrate the effectiveness of the methods in several large domains.

Full Text