Abstract

Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works such as ConceptNet and TupleKB compiled large CSK collections, but their expressiveness is restricted to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have prioritized either precision or recall, and hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important for expressing the temporal and spatial validity of assertions, as well as further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent. A web interface, data and code can be found at https://ascent.mpi-inf.mpg.de/.
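To make the difference between a plain SPO triple and a facet-refined assertion concrete, the following is a minimal sketch of how such an assertion could be represented. The class names, field names, the facet key `location` and its value are illustrative assumptions, not the schema used by the Ascent KB.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Concept:
    """A subject concept, optionally refined by a subgroup or an aspect.

    Hypothetical schema for illustration only.
    """
    name: str                        # primary concept, e.g. "elephant"
    subgroup: Optional[str] = None   # a more specific kind of the concept
    aspect: Optional[str] = None     # a part or facet-like aspect of the concept

    def label(self) -> str:
        # Prefer the subgroup name; otherwise combine concept and aspect.
        if self.subgroup:
            return self.subgroup
        if self.aspect:
            return f"{self.name} {self.aspect}"
        return self.name


@dataclass
class Assertion:
    """An SPO assertion plus optional semantic facets (qualifiers)."""
    subject: Concept
    predicate: str
    obj: str
    # Facets qualify the assertion, e.g. temporal or spatial validity.
    facets: Dict[str, str] = field(default_factory=dict)

    def as_triple(self) -> Tuple[str, str, str]:
        # Projection onto a plain triple: the facets are lost here.
        return (self.subject.label(), self.predicate, self.obj)

    def render(self) -> str:
        base = " ".join(self.as_triple())
        quals = ", ".join(f"{k}: {v}" for k, v in sorted(self.facets.items()))
        return f"{base} [{quals}]" if quals else base


# Illustrative instance; the facet is made up for this example.
example = Assertion(
    subject=Concept("elephant"),
    predicate="eat",
    obj="plants",
    facets={"location": "in the wild"},
)
print(example.render())
```

The `as_triple` projection shows what a triple-only KB would retain: the qualifier carried in `facets` has no place in the three-slot form.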

Highlights

  • Among the automatically constructed knowledge bases (KBs), our KB has the most salient assertions while demonstrating competitive quality in terms of typicality. These results indicate that our source selection, filtering and extraction scheme extracts important assertions better than other commonsense knowledge bases (CSKBs).

  • All KBs contribute contexts that improve the response quality of language models (LMs).

  • Ascent performs significantly better than the no-context baseline in the FG, GG and MP settings (p-values of paired t-test below 0.013). Besides, in the span prediction (SP) setting, where answers come directly from retrieved contexts, Ascent outperforms all competitors, indicating that our assertions have very high quality compared to other KBs, with statistically significant gains over TupleKB on both metrics and over Quasimodo on correctness.
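For readers unfamiliar with the paired t-test mentioned above, the sketch below shows how the test statistic is computed from per-question score pairs (the same questions judged with and without KB context). The function name and the scores are made up for illustration; this is not the paper's evaluation code or data.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(with_context, without_context):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Each position in the two lists must refer to the same question.
    """
    if len(with_context) != len(without_context):
        raise ValueError("paired samples must have equal length")
    # The test operates on per-item differences, not on the raw scores.
    diffs = [a - b for a, b in zip(with_context, without_context)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up ratings for four questions, purely illustrative.
scores_with_kb = [3, 4, 5, 4]
scores_no_context = [2, 2, 4, 2]
t_value, dof = paired_t_statistic(scores_with_kb, scores_no_context)
print(f"t = {t_value:.3f}, dof = {dof}")
```

The p-value is then obtained from the Student t distribution with `dof` degrees of freedom, which a statistics library would normally supply.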


Summary

Introduction

Commonsense knowledge (CSK) is a long-standing goal of AI [14, 26, 33]: equip machines with structured knowledge about everyday concepts and their properties (e.g., elephants are big and eat plants, buses carry passengers and drive on roads) and about typical human behavior and emotions (e.g., children love visiting zoos, children enter buses to go to school). Research on automatic acquisition of CSK assertions has been greatly advanced and several commonsense knowledge bases (CSKBs) of considerable size have been constructed (see, e.g., [35, 46, 53, 55]). Use cases for CSK include language-centric tasks such as question answering and conversational systems (see, e.g., [27, 28, 59]). Examples: Question-answering systems often need CSK as background knowledge for robust answers. When a child asks “Which zoos have habitats for T-Rex dinosaurs?”, the system

