Abstract
Background: In biomedical research, extracting chemical and disease relations from unstructured biomedical literature is an essential task. Effective context understanding and knowledge integration are the two main research problems in this task. Most work on relation extraction focuses on classifying entity mention pairs. Given the effectiveness of machine reading comprehension (RC) for context understanding, solving biomedical relation extraction with the RC framework at both the intra-sentential and inter-sentential levels is a new topic worth exploring. Besides unstructured biomedical text, many structured knowledge bases (KBs) provide valuable guidance for biomedical relation extraction, so using knowledge within the RC framework is also worth investigating. We propose a knowledge-enhanced reading comprehension (KRC) framework that leverages reading comprehension and prior knowledge for biomedical relation extraction. First, we generate questions for each relation, which reformulates the relation extraction task as a question answering task. Second, based on the RC framework, we integrate knowledge representation through an efficient knowledge-enhanced attention interaction mechanism to guide biomedical relation extraction.
Results: The proposed model was evaluated on the BioCreative V CDR and CHR datasets. Experiments show that our model achieved competitive document-level F1 scores of 71.18% and 93.3%, respectively, compared with other methods.
Conclusion: Result analysis reveals that open-domain reading comprehension data and knowledge representation help improve biomedical relation extraction in our proposed KRC framework. Our work can encourage more research on bridging reading comprehension and biomedical relation extraction, and so advance biomedical relation extraction.
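The reformulation of relation extraction as question answering can be illustrated with a minimal sketch. The question template, the `RCExample` type, and the `build_rc_example` helper below are hypothetical names chosen for illustration; the abstract does not state the actual question wording or data format used in the KRC framework.

```python
from dataclasses import dataclass

# Hypothetical template: the abstract says a question is generated per
# relation, but does not give the wording, so this phrasing is an assumption.
QUESTION_TEMPLATES = {
    "chemical-induced disease": "What disease does {chemical} induce?",
}


@dataclass(frozen=True)
class RCExample:
    """One reading comprehension instance derived from a candidate pair."""
    question: str
    context: str
    candidate_answer: str


def build_rc_example(relation: str, chemical: str, disease: str, context: str) -> RCExample:
    """Turn a (chemical, disease) candidate pair into a question over the context."""
    question = QUESTION_TEMPLATES[relation].format(chemical=chemical)
    return RCExample(question=question, context=context, candidate_answer=disease)


# Placeholder context, not text from the CDR corpus.
example = build_rc_example(
    relation="chemical-induced disease",
    chemical="flunitrazepam",
    disease="pain",
    context="<title and abstract of the document>",
)
print(example.question)
```

Under this framing, an RC model reads the context and decides whether the candidate disease answers the question, instead of classifying the mention pair directly.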
Highlights
In biomedical research, chemical and disease relation extraction from unstructured biomedical literature is an essential task
To make full use of the pretrained language model (LM) and knowledge representation, this paper proposes a knowledge-enhanced reading comprehension (RC) model based on pretrained LMs to improve biomedical relation extraction
Through experiments, we demonstrate the effectiveness of using open-domain reading comprehension data and knowledge information in our proposed RC framework for biomedical relation extraction
Summary
Chemical and disease relation extraction from unstructured biomedical literature is an essential task. Based on the RC framework, we integrate knowledge representation through an efficient knowledge-enhanced attention interaction mechanism to guide biomedical relation extraction. Chemicals, diseases, and their relations play an important role in biomedical research [1], and relation extraction is an essential task in biomedical text information extraction. As described by [3], 30% of the relations in the BioCreative V CDR data are expressed across more than one sentence. As an example, the paper shows the title and abstract of a document containing two chemical-induced disease pairs, (D005445, D004244) and (D005445, D010146). The chemical ‘flunitrazepam’ and the disease ‘pain’ appear in the same sentence, while the chemical ‘flunitrazepam’ and the disease ‘dizziness’ are expressed across sentence boundaries.
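The distinction between intra-sentential and inter-sentential pairs can be sketched as a simple check over sentence-split text. This is an illustrative assumption, not the paper's preprocessing: `pair_level` is a hypothetical helper, it matches mentions by case-insensitive substring, and the two toy sentences are invented for the example rather than taken from the CDR corpus.

```python
def pair_level(sentences: list[str], chemical: str, disease: str) -> str:
    """Label a (chemical, disease) pair by where its mentions co-occur.

    Returns "intra" if some sentence mentions both entities, "inter" if both
    are mentioned but never in the same sentence, and "absent" otherwise.
    """
    chem_idx = {i for i, s in enumerate(sentences) if chemical.lower() in s.lower()}
    dis_idx = {i for i, s in enumerate(sentences) if disease.lower() in s.lower()}
    if not chem_idx or not dis_idx:
        return "absent"
    return "intra" if chem_idx & dis_idx else "inter"


# Toy sentences invented for illustration only.
sentences = [
    "The chemical flunitrazepam and the disease pain are mentioned together here.",
    "Dizziness is mentioned only in this later sentence.",
]

print(pair_level(sentences, "flunitrazepam", "pain"))
print(pair_level(sentences, "flunitrazepam", "dizziness"))
```

A real pipeline would work from annotated mention offsets and entity identifiers such as D005445 rather than surface strings, since one entity can appear under several names.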