Abstract

Reasoning over commonsense knowledge bases (CSKBs) whose elements are in the form of free text is an important yet hard task in NLP. While CSKB completion only fills in missing links within the domain of the CSKB, CSKB population is proposed as an alternative, with the goal of reasoning over unseen assertions from external resources. In this task, CSKBs are grounded to a large-scale eventuality (activity, state, and event) graph to discriminate whether novel triples from the eventuality graph are plausible or not. However, existing evaluations of the population task are either inaccurate (automatic evaluation with randomly sampled negative examples) or small in scale (human annotation). In this paper, we benchmark the CSKB population task with a new large-scale dataset, built by first aligning four popular CSKBs and then presenting a high-quality human-annotated evaluation set to probe neural models' commonsense reasoning ability. We also propose a novel inductive commonsense reasoning model that reasons over graphs. Experimental results show that generalizing commonsense reasoning to unseen assertions is inherently a hard task: models achieving high accuracy during training perform poorly on the evaluation set, leaving a large gap from human performance. We make the data publicly available for future research. Code and data are available at this https URL.

Highlights

  • Commonsense reasoning is one of the core problems in the field of artificial intelligence

  • Denote the source commonsense knowledge bases (CSKBs) about events as C = {(h, r, t) | h ∈ H, r ∈ R, t ∈ T}, where H, R, and T are the sets of commonsense heads, relations, and tails, respectively

  • The training of CSKB population can inherit the setting of triple classification, where ground-truth examples come from the CSKB C and negative triples are randomly sampled, as sketched below
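
As a concrete illustration of the two highlights above, here is a minimal Python sketch (not the authors' code; the triples, relation names, and helper below are hypothetical) of a CSKB stored as (h, r, t) triples and of random negative sampling for triple-classification training:

```python
import random

# Hypothetical toy CSKB C: each assertion is a (head, relation, tail)
# triple, with h in H, r in R, t in T as in the notation above.
cskb = [
    ("PersonX goes to the gym", "xIntent", "to stay healthy"),
    ("PersonX drops the glass", "xEffect", "the glass breaks"),
    ("PersonX wins the lottery", "xReact", "excited"),
]
heads = [h for h, _, _ in cskb]
tails = [t for _, _, t in cskb]
positives = set(cskb)

def sample_negative(triple):
    """Randomly corrupt the head or the tail of a gold triple,
    rejecting corruptions that collide with a known positive."""
    h, r, t = triple
    while True:
        if random.random() < 0.5:
            candidate = (random.choice(heads), r, t)
        else:
            candidate = (h, r, random.choice(tails))
        if candidate not in positives:
            return candidate

# Balanced training set: gold triples labeled 1, sampled negatives labeled 0.
train = [(trip, 1) for trip in cskb]
train += [(sample_negative(trip), 0) for trip in cskb]
random.shuffle(train)
```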


Summary

Introduction

Commonsense reasoning is one of the core problems in the field of artificial intelligence. Since the proposal of Cyc (Lenat, 1995) and ConceptNet (Liu and Singh, 2004; Speer et al., 2017), a growing number of large-scale human-annotated CSKBs have been developed (Sap et al., 2019; Bisk et al., 2020; Sakaguchi et al., 2020; Mostafazadeh et al., 2020; Forbes et al., 2020; Lourie et al., 2020; Hwang et al., 2020; Ilievski et al., 2020). For example, ATOMIC (Sap et al., 2019) is a human-annotated if-then commonsense knowledge base among daily events and (mental) states. To effectively and accurately evaluate CSKB population, instead of annotating every possible node pair in the graph, which would take an infeasible O(|V|^2) amount of annotation, we sample a large subset of candidate edges grounded in ASER to annotate.
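
A minimal sketch of this sampling step, under assumed data structures (the graph edges and matched node sets below are hypothetical stand-ins, not taken from ASER itself):

```python
import random

# Hypothetical eventuality graph: edges connect normalized eventualities,
# standing in for edges observed in an ASER-like resource.
graph_edges = [
    ("PersonX is hungry", "PersonX eats lunch"),
    ("PersonX eats lunch", "PersonX is full"),
    ("PersonX is tired", "PersonX takes a nap"),
]

# Graph nodes that were matched to CSKB heads/tails during grounding.
matched_heads = {"PersonX is hungry", "PersonX is tired"}
matched_tails = {"PersonX eats lunch", "PersonX takes a nap"}

# Candidate edges for annotation: keep only grounded graph edges,
# a far smaller set than the O(|V|^2) Cartesian product of node pairs.
candidates = [(u, v) for u, v in graph_edges
              if u in matched_heads and v in matched_tails]

# Sample a fixed-size subset of the candidates for human annotation.
k = min(2, len(candidates))
to_annotate = random.sample(candidates, k)
print(to_annotate)
```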

Task Definition
Selection of CSKBs
Result
Evaluation Set Preparation
Quality Control
Experiments
Main Results
Zero-shot Setting
A Additional Details of Commonsense Relations
Examples of Format Unification
Original Test Set
Model Details
Examples of Populated Triples
Examples of Annotations of the Populated Triples
Neighboring Function N
