Abstract

Background

The effective combination of text and knowledge may improve performance on natural language processing tasks. Chemical-induced disease (CID) relations may span sentence boundaries in an article. Although existing CID systems have explored the use of knowledge bases, they have not distinguished the effects of different knowledge on the identification of a specific CID relation. Moreover, systems based on neural networks have only constructed sentence-level or mention-level models.

Results

In this work, we proposed an effective document-level neural model that integrates domain knowledge to extract CID relations from biomedical articles. Basic semantic information of an article with respect to a specific CID candidate pair was learned by the document-level sub-network module. Furthermore, knowledge attention that depends on the representation of the article was proposed to distinguish the influences of different knowledge on the specific CID pair, and the final knowledge representation was formed by aggregating the weighted knowledge. Finally, the integrated representations of text and knowledge were passed to a softmax classifier to perform CID recognition. Experimental results on the chemical-disease relation corpus proposed by BioCreative V show that the proposed knowledge-integrated system achieves good overall performance compared with other state-of-the-art systems.

Conclusions

Experimental analyses demonstrate that the introduced attention mechanism on domain knowledge plays a significant role in distinguishing the influences of different knowledge on the judgment of a specific CID relation.
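The knowledge attention step described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the scoring function, so a plain dot product between the document vector and each knowledge vector is assumed here, and all names (`knowledge_attention`, `classify`, `W`, `b`) and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def knowledge_attention(doc_vec, knowledge_vecs):
    """Weight each knowledge vector by its relevance to the document.

    Assumption: relevance is a dot product; the paper may use another scorer.
    """
    weights = softmax([dot(doc_vec, k) for k in knowledge_vecs])
    dim = len(knowledge_vecs[0])
    # Aggregate the weighted knowledge vectors into one representation.
    aggregated = [
        sum(w * k[i] for w, k in zip(weights, knowledge_vecs))
        for i in range(dim)
    ]
    return weights, aggregated


def classify(doc_vec, knowledge_vecs, W, b):
    """Concatenate text and knowledge representations, then apply softmax."""
    weights, k_vec = knowledge_attention(doc_vec, knowledge_vecs)
    features = doc_vec + k_vec  # list concatenation
    logits = [dot(row, features) + bias for row, bias in zip(W, b)]
    return weights, softmax(logits)


# Toy inputs (hypothetical): one document vector, two knowledge vectors.
doc_vec = [1.0, 0.0]
knowledge_vecs = [[2.0, 0.0], [0.0, 2.0]]
W = [[0.5, -0.5, 0.5, -0.5], [-0.5, 0.5, -0.5, 0.5]]
b = [0.0, 0.0]

weights, probs = classify(doc_vec, knowledge_vecs, W, b)
```

In this toy setup the first knowledge vector aligns with the document vector and therefore receives the larger attention weight, which is the behavior the abstract attributes to knowledge attention: different knowledge contributes unequally to the decision for a given CID pair.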

Highlights

  • The effective combination of text and knowledge may improve performance on natural language processing tasks

  • Support vector machine (SVM)-based systems [7, 8, 9, 11] took advantage of knowledge either as features of equal importance or as Boolean features, while the NN-based system [17] indiscriminately concatenated one-hot representations of knowledge as a feature of the model. Because the relations in the Comparative Toxicogenomics Database (CTD) differ in nature from one another, they cannot all contribute equally to helping a classifier recognize a chemical-induced disease (CID) relation

  • For the two reasons mentioned above, we explored how to distinguish the influences of different knowledge on the judgment of a specific CID relation when knowledge is incorporated as features into an NN-based model



