Abstract

Relation extraction is a common approach to extending knowledge graphs (KGs); however, when it is applied to a particular domain, text sparsity becomes a significant issue. Distant supervision is introduced as a remedy, but it simultaneously brings in noise. Both issues are especially pronounced for Chinese, since Chinese KGs are less developed than those for widely used languages such as English. To tackle this challenge, we propose a complementary convolutional neural network (com-CNN) with attentional multiple instance learning (MIL) that obtains highly comprehensive features for relation extraction and alleviates the negative effect of sentence-level noise. Our com-CNN model fully captures information from two different representations of a relation instance, the raw word sequence (RWS) and the multiple dependency path (MDP), and enables them to complement each other. To better combine RWS and MDP, we design a flexible feature fusion method. To mitigate the over-fitting of the attention mechanism caused by sparse text, entity information is employed to guide the computation of attention scores over the multiple instances in a bag, thereby alleviating the impact of wrongly labelled data. Experiments on Chinese relation extraction show that our proposal outperforms state-of-the-art approaches and that the combination of RWS and MDP generates more representative features for relation extraction. Moreover, the empirical results confirm that entity-integrated attentional MIL offers the best denoising performance on sparse domain texts compared with alternatives.
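To make the entity-guided attentional MIL idea concrete, the following is a minimal sketch in NumPy, not the paper's implementation: sentence-level features from two hypothetical encoder branches (RWS and MDP) are fused by a simple gated concatenation (the fixed `gate` weight stands in for the paper's learned, flexible fusion), and an entity-derived query vector scores the instances in a bag before weighted aggregation. All function names, dimensions, and the random entity query are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_features(rws_feat, mdp_feat, gate=0.5):
    """Illustrative fusion: gated concatenation of RWS and MDP feature vectors.
    The paper's fusion is a learned, flexible scheme; the fixed gate here is a placeholder."""
    return np.concatenate([gate * rws_feat, (1.0 - gate) * mdp_feat])

def entity_guided_bag_attention(sentence_feats, entity_query):
    """Score each sentence representation in a bag against an entity-derived query,
    then aggregate the bag with the resulting attention weights."""
    scores = sentence_feats @ entity_query   # (num_sentences,)
    alphas = softmax(scores)                 # attention distribution over instances
    bag_repr = alphas @ sentence_feats       # weighted bag representation
    return bag_repr, alphas

# Toy usage: a bag of 4 sentences, each encoded by two hypothetical CNN
# branches (RWS and MDP) into 8-dimensional feature vectors.
rws_feats = rng.normal(size=(4, 8))
mdp_feats = rng.normal(size=(4, 8))
sentence_feats = np.stack(
    [fuse_features(r, m) for r, m in zip(rws_feats, mdp_feats)]
)

# Entity query: a random 16-dim vector standing in for a learned embedding
# of the entity pair.
entity_query = rng.normal(size=16)

bag_repr, alphas = entity_guided_bag_attention(sentence_feats, entity_query)
print("attention weights:", np.round(alphas, 3))
print("bag representation shape:", bag_repr.shape)
```

In this sketch, instances whose fused features align poorly with the entity query receive small attention weights, which is the intuition behind using entity information to down-weight wrongly labelled sentences in a bag.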
