TECHNOLOGY FOR CREATING A DOMAIN KNOWLEDGE BASE OF QUESTION-ANSWERING SYSTEM BASED ON A LARGE-SCALE UNIVERSAL KNOWLEDGE BASE

Nikita Titov, Sergey Makrushin

doi:10.33693/2313-223x-2022-9-1-115-124

Abstract

Question-answering systems have become a popular way to access knowledge bases that contain a large number of facts from different domains. Open-access large-scale universal knowledge bases, such as Wikidata, contain huge collections of facts. Although these bases cover a substantial part of accumulated human knowledge, there are a number of reasons why using them directly in question-answering systems can be less preferable than building specialized domain knowledge bases derived from them. This paper presents a technology for building a domain knowledge base for a dialogue system by detecting domain boundaries in an open large-scale universal knowledge base. The technology relies on a multi-step analysis of a large number of free-form questions on a given subject area, collected through a crowdsourcing platform. It includes correcting the ontological structure of the original knowledge base and enriching it with additional content. The proposed technology is independent of both the original knowledge base and the subject area being modeled. We tested it on the Wikidata knowledge base and six subject areas.

Full Text