Abstract

The task of knowledge base population (KBP) aims to discover facts about entities from texts and expand a knowledge base with these facts. Previous studies cast end-to-end KBP as a machine translation task, which must convert an unordered set of facts into a sequence according to a pre-specified order. However, the facts stated in a sentence are unordered in essence. In this paper, we formulate end-to-end KBP as a direct set generation problem, avoiding any consideration of the order of multiple facts. To solve the set generation problem, we propose networks featuring transformers with non-autoregressive parallel decoding. Unlike previous approaches that use an autoregressive decoder to generate facts one by one, the proposed networks can directly output the final set of facts in one shot. Furthermore, to train the networks, we also design a set-based loss that forces unique predictions via bipartite matching. Compared with a cross-entropy loss, which heavily penalizes small shifts in fact order, the proposed bipartite matching loss is invariant to any permutation of the predictions. Freed from the burden of predicting the order of multiple facts, our proposed networks achieve state-of-the-art (SoTA) performance on two benchmark datasets.
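The bipartite matching behind the set-based loss can be sketched as follows. The helper name, cost matrix, and toy numbers are illustrative assumptions, not the paper's exact formulation: in practice the costs come from predicted fact probabilities and an efficient implementation would use the Hungarian algorithm rather than brute force.

```python
from itertools import permutations

def bipartite_match(cost):
    """Find the minimum-cost one-to-one assignment of predictions to
    gold facts by brute force (fine for a small toy example; a real
    implementation would use the Hungarian algorithm).

    cost[i][j] is the matching cost of prediction i and gold fact j.
    Returns (total cost, tuple mapping prediction i -> gold index).
    """
    n = len(cost)
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

# Toy cost matrix (illustrative numbers): rows are predicted facts,
# columns are gold facts; in practice shorter sets are padded with a
# "no fact" label so the matrix is square.
cost = [
    [0.1, 0.9],
    [0.8, 0.2],
]
loss, assignment = bipartite_match(cost)
# Reordering the rows (predictions) changes the returned permutation
# but not the loss value, which is the permutation invariance the
# set-based loss provides.
```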

Highlights

  • Relation extraction aims to predict semantic relations between pairs of entities

  • Knowledge bases (KBs) are valuable resources that can provide back-end support for various real-world knowledge-centric services; in practice, however, the pipeline architecture is inherently prone to error propagation between its components (Trisedya et al., 2019)

  • We formulate the end-to-end knowledge base population (KBP) task as a set generation problem, avoiding the need to consider the order of multiple facts


Summary

Introduction

Relation extraction aims to predict semantic relations between pairs of entities. The goal of end-to-end KBP is to identify all possible facts Y = {< h1, r1, t1 >, ..., < hn, rn, tn >} stated in a given sentence X to enrich the given reference KB. We formulate the end-to-end KBP task as a set generation problem, avoiding the need to consider the order of multiple facts. To address the set generation problem, we propose end-to-end networks, dubbed "Set Generator". We introduce the learning of word and entity embeddings in Section 2.1, which are the basis of the proposed networks, and the sentence encoder in Section 2.2, which represents each token in a given sentence based on its bidirectional context. The output of the transformer-based encoder is denoted as He ∈ Rl×d, where l is the sentence length and d is the output dimension of the transformer-based encoder.
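One-shot decoding over the encoder output He can be sketched with a single dot-product attention step standing in for the transformer decoder. The names `He` and `queries`, the use of plain lists, and the toy numbers are all simplifying assumptions; the actual networks use multi-layer transformer cross-attention, but the key property shown here carries over: every fact query is decoded independently, so the output is an unordered set produced in parallel.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decode_facts(He, queries):
    """Produce all m fact representations in one shot.

    He:      encoder output, a list of l token vectors of dimension d.
    queries: m learned fact queries, each of dimension d.

    Each query attends to He independently of the others, so no fact
    depends on a previously generated one and no ordering is imposed.
    """
    facts = []
    for q in queries:  # independent per query -> parallelizable
        scores = [sum(qi * hi for qi, hi in zip(q, h)) for h in He]
        attn = softmax(scores)
        d = len(q)
        ctx = [sum(a * h[k] for a, h in zip(attn, He)) for k in range(d)]
        facts.append(ctx)  # would be mapped to a (head, relation, tail) triple
    return facts  # m vectors of dimension d: one candidate fact per query

# Toy usage: l = 2 tokens, d = 2, m = 2 queries.
He = [[1.0, 0.0], [0.0, 1.0]]
queries = [[10.0, 0.0], [0.0, 10.0]]
facts = decode_facts(He, queries)
```

Here each query locks onto a different token, yielding two distinct fact representations in a single forward pass, with no query conditioned on another's output.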

Set Generator
Datasets and Evaluation Metrics
Implementation Details
Ablation Studies
