Abstract

Background
Most current methods for intrasentence relation extraction in the biomedical literature are inadequate for document-level relation extraction, in which a relationship may cross sentence boundaries. Hence, some approaches extract relations by splitting document-level datasets through heuristic rules and learning methods. However, these approaches may introduce additional noise and do not truly solve the problem of intersentence relation extraction. It remains challenging to avoid noise while extracting cross-sentence relations.

Objective
This study aimed to avoid the errors introduced by dividing the document-level dataset, to verify that a self-attention structure can extract biomedical relations in a document with long-distance dependencies and complex semantics, and to discuss the relative benefits of different entity pretreatment methods for biomedical relation extraction.

Methods
This paper proposes a new data preprocessing method and applies a pretrained self-attention structure to document-level biomedical relation extraction, using an entity replacement method to capture very long-distance dependencies and complex semantics.

Results
Compared with state-of-the-art approaches, our method greatly improved precision and increased the F1 value. Through experiments on biomedical entity pretreatments, we found that a model using an entity replacement method can improve performance.

Conclusions
When all target entity pairs in a document-level dataset are considered as a whole, a pretrained self-attention structure is suitable for capturing very long-distance dependencies and learning textual context and complicated semantics. An entity replacement method is conducive to biomedical relation extraction, especially document-level relation extraction.

Highlights

  • A large number of biomedical entity relations exist in the biomedical literature

  • Through experiments of biomedical entity pretreatments, we found that a model using an entity replacement method can improve performance

  • Experimenting on different datasets, including 2 sentence-level corpora and a document-level corpus, we compare various biomedical entity pretreatments and analyze which preprocessing is better for the self-attention structure

Introduction

A large number of biomedical entity relations exist in the biomedical literature. Automatically and accurately extracting these relations to form structured knowledge benefits the development of biomedical fields. Several datasets have been proposed for extracting biomedical relations, such as drug-drug interactions (DDI) [1], chemical-protein relations (CPR) [2], and chemical-induced diseases (CID) [3]. The former 2 are sentence-level annotated datasets. To deal with long and complicated sentences, Sun et al [5] separated sequences into short context subsequences and proposed a hierarchical recurrent convolutional neural network (CNN). Because these approaches cannot be directly applied to document-level datasets, some existing methods [8,9] divided the document-level dataset into 2 parts and trained an intrasentence model and an intersentence model. Such approaches, which split document-level datasets through heuristic rules and learning methods, may introduce additional noise and do not truly solve the problem of intersentence relation extraction. It is challenging to avoid noise and extract cross-sentence relations.
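The entity replacement pretreatment discussed above can be illustrated with a minimal sketch. Note that the function name, placeholder format (eg, @CHEMICAL$), and example spans below are illustrative assumptions, not the paper's exact implementation; the idea is simply to substitute each annotated entity mention with a typed placeholder so the model learns relation patterns rather than memorizing surface forms.

```python
def replace_entities(text, mentions):
    """Replace each annotated mention with a typed placeholder token.

    `mentions` is a list of (start, end, entity_type) character spans.
    Spans are replaced right-to-left so earlier offsets remain valid
    after each substitution.
    """
    for start, end, etype in sorted(mentions, reverse=True):
        text = text[:start] + f"@{etype.upper()}$" + text[end:]
    return text

# Hypothetical sentence with two annotated entities (character offsets).
sentence = "Aspirin can induce asthma in sensitive patients."
mentions = [(0, 7, "chemical"), (19, 25, "disease")]
print(replace_entities(sentence, mentions))
# -> @CHEMICAL$ can induce @DISEASE$ in sensitive patients.
```

Replacing mentions with typed placeholders also normalizes entities that appear under multiple surface forms across a document, which matters when all target entity pairs are considered jointly at the document level.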

