Abstract

Matrix factorization of knowledge bases in universal schema has facilitated accurate distantly-supervised relation extraction. This factorization encodes dependencies between textual patterns and structured relations using low-dimensional vectors defined for each entity pair; although these factors are effective at combining evidence for an entity pair, they are inaccurate on rare pairs, or for relations that depend crucially on the entity types. On the other hand, tensor factorization is able to overcome these shortcomings when applied to link prediction by maintaining entity-wise factors. However, these models have been unsuitable for universal schema. In this paper we first present an illustration on synthetic data that explains the unsuitability of tensor factorization for relation extraction with universal schemas. Since the benefits of tensor and matrix factorization are complementary, we then investigate two hybrid methods that combine the strengths of the two paradigms. We show that the combination can be fruitful: we handle ambiguously phrased relations, achieve gains in accuracy on real-world relations, and demonstrate that entity embeddings encode entity types.

Highlights

  • Distantly-supervised relation extraction has gained prominence as it utilizes automatically aligned data to train accurate extractors

  • We explore the application of matrix and tensor factorization for universal schema data

  • We present improved accuracy on real-world relation extraction data, and demonstrate that the entity embeddings are effective at encoding entity types

Summary

Introduction

Distantly-supervised relation extraction has gained prominence as it utilizes automatically aligned data to train accurate extractors. An important shortcoming of this matrix factorization model for universal schema is that no information is shared between the rows that contain the same entity. This can significantly impact accuracy on pairs of entities that are not mentioned together frequently, and for relations that depend crucially on fine-grained entity types, such as schoolAttended, nationality, and bookAuthor. Tensor factorization for knowledge-base completion maintains per-entity factors that combine evidence from all the relations an entity participates in, to predict its relations to other entities – a task known as link prediction (Nickel et al., 2012; Bordes et al., 2013). These entity factors, as opposed to pairwise factors in matrix factorization, can be quite effective in identifying the latent, fine-grained entity types. We present improved accuracy on real-world relation extraction data, and demonstrate that the entity embeddings are effective at encoding entity types.
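The contrast between the two scoring schemes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the entities, relation names, embedding dimension, and random initialization are all hypothetical, and the entity-factor scorer follows a Model E-style form (one vector per entity, one relation vector per argument slot), while the matrix factorization scorer assigns one factor per entity pair as in universal schema.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # hypothetical embedding dimension

# Matrix factorization (universal schema): one latent factor per
# (subject, object) pair, shared across all textual and structured relations.
pair_vecs = {("barack_obama", "hawaii"): rng.normal(size=dim)}
rel_vecs = {"bornIn": rng.normal(size=dim)}

def mf_score(pair, rel):
    # Dot product between the entity-pair factor and the relation factor.
    # Pairs never seen in training have no factor at all.
    return float(pair_vecs[pair] @ rel_vecs[rel])

# Entity-factor (tensor-style, Model E-like) scoring: one vector per entity,
# and two relation vectors, one per argument slot, so evidence is shared
# across every pair the entity appears in.
ent_vecs = {
    "barack_obama": rng.normal(size=dim),
    "hawaii": rng.normal(size=dim),
}
rel_arg_vecs = {"bornIn": (rng.normal(size=dim), rng.normal(size=dim))}

def entity_score(subj, obj, rel):
    r_subj, r_obj = rel_arg_vecs[rel]
    return float(ent_vecs[subj] @ r_subj + ent_vecs[obj] @ r_obj)
```

Because `entity_score` only needs per-entity vectors, it can score a pair that never co-occurred in training, which is exactly the generalization the pairwise model lacks; conversely, the pairwise factor can pool evidence from every pattern observed for that specific pair.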

Universal Schema
Matrix Factorization with Factors over Entity-Pairs
Tensor Factorization with Entity Factors
Tucker2 Decomposition and RESCAL
TransE
Model E
Combined Tensor and Matrix Factorization for Universal Schema
Illustration Using Synthetic Relations
Hybrid Factorization Models
Parameter Estimation
Experiments
Synthetic RGB Relations
Universal Schema Relation Extraction
Entity Embeddings and Types
Findings
Conclusions and Future Work
