Abstract

We show that state-of-the-art self-supervised language models can be readily used to extract relations from a corpus without the need to train a fine-tuned extractive head. We introduce RE-Flex, a simple framework that performs constrained cloze completion over pretrained language models for unsupervised relation extraction. RE-Flex uses contextual matching to ensure that language model predictions match supporting evidence from the input corpus that is relevant to a target relation. We perform an extensive experimental study over multiple relation extraction benchmarks and demonstrate that RE-Flex outperforms competing unsupervised relation extraction methods based on pretrained language models, beating the next-best method by up to 27.8 F1 points. Our results show that constrained inference queries against a language model can enable accurate unsupervised relation extraction.
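
To make the cloze formulation concrete, the sketch below issues an unconstrained cloze-style relation query against an off-the-shelf masked language model via the Hugging Face transformers library. It is a minimal illustration, not the RE-Flex procedure; the model name and the "born-in" template are placeholder choices.

    # Minimal sketch: an unconstrained cloze-style relation query against a
    # pretrained masked LM (illustrative only; not the RE-Flex procedure).
    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")

    # Cloze template for a "born-in" relation, instantiated for one subject.
    for candidate in fill("Barack Obama was born in [MASK]."):
        print(candidate["token_str"], round(candidate["score"], 3))

Without any constraint, nothing ties these predictions to evidence in a corpus, which is the gap RE-Flex addresses.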

Highlights

  • Relation extraction is a fundamental problem in constructing knowledge bases from unstructured text

  • While cloze-style queries over pretrained language models are promising, we show that an out-of-the-box application of these methods to general relation extraction falls short of extractive question answering (QA) models

  • The core limitation is that of factual generation: language models do not reliably memorize general factual information (Petroni et al., 2019), and are liable to predict off-topic or non-factual tokens (See et al., 2017)


Summary

Introduction

Relation extraction is a fundamental problem in constructing knowledge bases from unstructured text. A key idea behind general relation extraction is to leverage question answering (QA) models, using the reading comprehension capabilities of modern language models to identify relation mentions in text. While effective in domains related to the annotated question-answer data, supervised extractive QA approaches can fail to generalize to new domains for which annotations are not available (Dhingra et al., 2018). Given an extractive relational cloze query and an associated context, we propose a method that restricts the model's answer to factual information present in the associated context. We introduce a context-constrained inference procedure over language models that does not require altering the pre-training algorithm. This procedure redistributes the probability mass of the language model's initial prediction to tokens that are present in the context. Our results demonstrate that by constraining language generation in this way, RE-Flex yields accurate unsupervised relation extractions.
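
The sketch below illustrates this redistribution step under simplifying assumptions: a BERT-style masked LM, a single-token answer, and a bag-of-tokens view of the context. The function name and concatenation scheme are our own illustrative choices; RE-Flex's full procedure, including contextual matching and context rejection, is more involved.

    # Hedged sketch of context-constrained cloze inference. Assumes a
    # BERT-style masked LM and a single-token answer; not the full
    # RE-Flex procedure.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    def constrained_cloze(template: str, context: str) -> str:
        """Fill [MASK] in `template`, restricted to tokens from `context`."""
        # Prepend the context so the model can attend to supporting evidence.
        inputs = tokenizer(context + " " + template, return_tensors="pt")
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
        with torch.no_grad():
            probs = model(**inputs).logits[0, mask_pos].softmax(dim=-1)
        # Redistribute probability mass: zero out every token that does not
        # appear in the context, then renormalize over the survivors.
        allowed = tokenizer(context, add_special_tokens=False).input_ids
        constrained = torch.zeros_like(probs)
        constrained[allowed] = probs[allowed]
        constrained /= constrained.sum()
        return tokenizer.decode([int(constrained.argmax())])

    # Example: the prediction is forced to come from the evidence sentence.
    print(constrained_cloze("Barack Obama was born in [MASK].",
                            "Obama was born in Honolulu, Hawaii."))

Because the renormalized distribution places mass only on context tokens, an off-topic completion that is absent from the evidence can never be returned.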

Outline

  • Related Work
  • Initialize Cloze Templates
  • Problem Statement
  • The RE-Flex Framework
  • Context Rejection
  • Relation Extraction
  • Experimental Evaluation
  • Datasets and Benchmarks
  • Metrics
  • Defining cloze templates
  • Competing Methods
  • LAMA Benchmarks
  • General Relation Benchmarks
  • Method
  • Conclusion
  • A Implementation Details
  • B Qualitative Results
  • C Dataset Details
  • D Competing Methods Implementation Details
  • E Microbenchmarks
  • F Biases of QA Models
