Abstract
Distant supervised relation extraction is widely used to find novel relational facts in unstructured text. To the best of our knowledge, nearly all existing relation extraction models assume that each sentence contains exactly one entity pair, i.e., exactly two entities. In practice, however, datasets constructed by distant supervision contain many sentences with repeated entities, so a sentence may contain more than two entity mentions. This phenomenon violates the assumption of existing models and inevitably leads to the attention bias problem: the model focuses on the wrong entities during relation extraction. To alleviate this problem, we draw on the idea of ensemble learning and propose a novel distant supervised relation extraction model. The proposed model follows the multi-instance multi-label learning mechanism and performs relation extraction on sentence-bag representations. Specifically, it first identifies the most critical entity and keywords, and then uses a voting mechanism to determine the sentence-level and bag-level features. Experimental results on a popular benchmark dataset show that the proposed model outperforms state-of-the-art baselines in relation extraction; they also indicate that the model alleviates the problem caused by the repeated-entity phenomenon.