Most information retrieval research focuses on collecting documents that match the same set of concepts. This study considers a more advanced problem, namely how to discover knowledge that is not contained in any single source by combining historical facts. Using a well-designed core ontology for the cultural domain (CIDOC CRM, ISO 21127), this study discusses the requirements for a robust inference platform for real-life knowledge discovery and integration over distributed sources. The methodology and design are justified in detail through functional requirements for an inference service capable of inferring new knowledge from combinations of facts distributed over different sources. A number of critical issues for developing such a robust inference platform are identified, namely (1) systematic accumulation of common concepts and inference rules; (2) extending the ontology with metaclasses; (3) accumulation of factual and categorical knowledge; (4) incorporation of fuzzy inference into the inference engine; and (5) improvement of performance and scalability in the inference engine.