Abstract

Automatically extracting relations between gene mutations and cancer entities from the cancer literature through text mining can rapidly provide vital information to support precision cancer medicine. However, mutation-cancer relation extraction is more challenging than general relation extraction from free text: it is often not possible without cancer-specific background knowledge, so a model must rely on a deeper understanding of the complex surrounding tokens. We propose a deep learning model that jointly extracts mutations and their associated cancers. Background knowledge is drawn from two knowledge bases (KBs) that store different types of information about mutations. Because knowledge is stored differently in these two resources, we propose two separate methods for embedding it: sentence-based knowledge integration and attribute-aware knowledge integration. In our evaluation, the model outperforms a number of baseline models and achieves F1 scores of 96.00%, 92.57%, and 94.57% on three public datasets (EMU BCa, EMU PCa, and BRONCO, respectively), illustrating the effectiveness of our knowledge integration approach. Auxiliary experiments show that our models can utilize more informative text from the KBs and link mutations to their corresponding cancers even when the input text provides insufficient context.

