Abstract

While systems for question answering over knowledge bases (KBs) continue to progress, real-world usage requires systems that are robust to incomplete KBs. Dependence on the closed-world assumption is highly problematic, as in many practical cases the information is constantly evolving and KBs cannot keep up. In this paper we formalize a typology of missing information in knowledge bases and present a dataset based on the Spider KB question answering dataset, in which we deliberately remove information from several knowledge bases, here implemented as relational databases (the dataset and the code to reproduce experiments are available at https://github.com/camillepradel/IDK.). Our dataset, called IDK (Incomplete Data in Knowledge base question answering), allows studying how to detect such cases and recover from them. Our analysis shows that simple baselines fail to detect most of the unanswerable questions.
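To make the removal step concrete, the sketch below shows one way incompleteness could be simulated on a Spider-style SQLite database. It is a minimal illustration under stated assumptions: the helper name, the chosen removals (dropping a table, deleting a subset of rows), and the example identifiers are hypothetical and do not describe the exact procedure used to build IDK.

    # Illustrative sketch: simulate an incomplete KB by removing information
    # from a copy of a Spider-style SQLite database. The specific removals
    # are assumptions for illustration, not the IDK construction procedure.
    import shutil
    import sqlite3
    from typing import Optional

    def make_incomplete(src_db: str, dst_db: str, table_to_drop: str,
                        row_filter: Optional[str] = None) -> None:
        """Copy the database and remove information from the copy only."""
        shutil.copyfile(src_db, dst_db)          # keep the original KB intact
        conn = sqlite3.connect(dst_db)
        cur = conn.cursor()
        # Missing relation: drop an entire table from the schema.
        cur.execute(f'DROP TABLE IF EXISTS "{table_to_drop}"')
        # Missing entities: optionally delete a subset of rows elsewhere.
        if row_filter:
            cur.execute(row_filter)
        conn.commit()
        conn.close()

    # Hypothetical usage (paths, table and column names are invented):
    # make_incomplete("concert_singer.sqlite",
    #                 "concert_singer_incomplete.sqlite",
    #                 table_to_drop="singer_in_concert",
    #                 row_filter="DELETE FROM singer WHERE Age < 30")

Questions whose SQL translation touches the dropped table or the deleted rows then become unanswerable against the modified database, which is the kind of case the detection baselines are evaluated on.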
