Abstract

A variety of open relation extraction (ORE) systems have been developed over the last decade, and deep learning, especially with attention models, has achieved considerable success in relation classification. Nevertheless, to our knowledge, no prior work has addressed the classification of open relation tuples. In this paper, we propose a novel semieager learning algorithm (SemiE) to tackle the problem of open relation classification. Unlike eager learning approaches (e.g., ANNs) and lazy learning approaches (e.g., kNN), SemiE offers the benefits of both categories of learning scheme at a significantly lower computational cost (O(n)). The algorithm can also be applied to other classification tasks. Additionally, this paper presents an adapted attention model that transforms relation phrases into vectors using word embeddings. Experimental results on two benchmark datasets show that our method outperforms state-of-the-art methods on the task of open relation classification.
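The abstract does not detail SemiE's internals, but the idea of a semi-eager scheme with O(n) cost can be illustrated with a minimal sketch: an eager phase precomputes one summary per class in a single pass over the training vectors, and a lazy phase compares a query only against those summaries. Everything below is a hypothetical illustration, not the paper's actual algorithm: the centroid-based summarization, the names `SemiEagerClassifier` and `embed_phrase`, and the attention weights are all assumptions made for the example.

```python
import numpy as np

def embed_phrase(phrase, word_vectors, attention):
    """Hypothetical attention-weighted embedding of a relation phrase.
    word_vectors maps tokens to vectors; attention maps tokens to
    non-negative weights (tokens missing from it default to 1.0)."""
    tokens = [t for t in phrase.split() if t in word_vectors]
    if not tokens:
        raise ValueError("no known tokens in phrase")
    weights = np.array([attention.get(t, 1.0) for t in tokens])
    weights = weights / weights.sum()              # normalize to a distribution
    vectors = np.stack([word_vectors[t] for t in tokens])
    return weights @ vectors                       # attention-weighted average

class SemiEagerClassifier:
    """Sketch of a semi-eager scheme: the eager phase builds one centroid
    per class in O(n) over the training set; the lazy phase classifies a
    query by its nearest centroid, touching only |classes| vectors."""

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, x):
        distances = np.linalg.norm(self.centroids_ - np.asarray(x), axis=1)
        return self.labels_[np.argmin(distances)]
```

Compared with a pure lazy learner such as kNN (which stores all n training points and scans them at query time) or a pure eager learner such as an ANN (which pays a heavy training cost), this kind of scheme trains in one linear pass and answers queries in time proportional to the number of classes.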

Highlights

  • Information Extraction (IE) is the task of automatically collecting structured information from large amounts of unstructured data by learning an extractor from labeled training examples for each target relation [1,2,3]

  • Researchers at the University of Washington pioneered a new paradigm of open relation extraction (ORE), which enables the extraction of arbitrary relations from sentences by automatically identifying relation phrases, obviating the need for a prespecified relation vocabulary

  • Because ORE systems rely on unsupervised extraction strategies, these datasets generally consist only of unlabeled natural language sentences that cannot be used for training a supervised model, such as the semieager learning algorithm (SemiE)


Summary

Introduction

Information Extraction (IE) is the task of automatically collecting structured information from large amounts of unstructured data by learning an extractor from labeled training examples for each target relation [1,2,3]. The best-performing neural-embedding models are NTN [21] and the Bordes models (TransE and TATEC) [22,23], which extend the traditional relation classification task to semantic relation classification. Unlike those approaches, which are built over lexical and distributional word-vector features, Siamak et al. proposed a model that combines large commonsense knowledge bases of binary relations for the composite semantic relation classification problem [24]. These approaches to traditional relation classification are unsuitable for the variety and complexity of open relation types on the Web: ORE systems strongly favor speed, and these approaches are time-consuming.

