Abstract
The automatic extraction of meaningful relations from biomedical literature or clinical records is crucial in various biomedical applications. Most of the current deep learning approaches for medical relation extraction require large-scale training data to prevent overfitting of the training model. We propose using a pre-trained model and a fine-tuning technique to improve these approaches without additional time-consuming human labeling. Firstly, we show the architecture of Bidirectional Encoder Representations from Transformers (BERT), an approach for pre-training a model on large-scale unstructured text. We then combine BERT with a one-dimensional convolutional neural network (1d-CNN) to fine-tune the pre-trained model for relation extraction. Extensive experiments on three datasets, namely the BioCreative V chemical disease relation corpus, traditional Chinese medicine literature corpus and i2b2 2012 temporal relation challenge corpus, show that the proposed approach achieves state-of-the-art results (giving a relative improvement of 22.2, 7.77, and 38.5% in F1 score, respectively, compared with a traditional 1d-CNN classifier). The source code is available at https://github.com/chentao1999/MedicalRelationExtraction.
Highlights
Medical relations, such as chemical disease relations (CDRs) and chemical protein relations in modern medicine, herb-syndrome relations and formula-disease relations in traditional medicine, play a key role in a number of biomedical-related applications, e.g. clinical decisionmaking, drug discovery and drug side-effect detection.Manually extracting these relations is difficult and timeconsuming
We focus on pre-training models from unstructured text and fine-tuning the pre-trained models to improve the performance of existing deep learningbased medical relation extraction approaches with limited training data
This corresponds to relative improvements of 6.65, 5.26, 7.53, 12.5 and 7.09% compared with the ‘1d-convolutional neural networks (CNNs) with general embeddings’ approach, respectively. These results indicate that using the pretrained model and fine-tuning technique can improve the performance of the 1d-CNN classifier for traditional Chinese medicine (TCM) relation extraction
Summary
Medical relations, such as chemical disease relations (CDRs) and chemical protein relations in modern medicine, herb-syndrome relations and formula-disease relations in traditional medicine, play a key role in a number of biomedical-related applications, e.g. clinical decisionmaking, drug discovery and drug side-effect detection. Extracting these relations is difficult and timeconsuming. Shallow machine learning approaches consider medical relation extraction as a classification problem and generally use supervised learning and feature engineering to obtain high performance. These approaches require manually constructed features or rules. Among current deep learning approaches, convolutional neural networks (CNNs) are one of the key drivers of improvements [8]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.