Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task.

Chih-Hsuan Wei, Yifan Peng, Allan Peter Davis, Carolyn J Mattingly, Thomas C Wiegers, Zhiyong Lu, Robert Leaman, Jiao Li

doi:10.1093/database/baw032

Abstract

Manually curating chemicals, diseases and their relationships is important to biomedical research, but it is hampered by its high cost and by the rapid growth of the biomedical literature. In recent years, there has been growing interest in developing computational approaches for automatic chemical-disease relation (CDR) extraction. Despite these attempts, the lack of a comprehensive benchmarking dataset has limited the comparison of different techniques needed to assess and advance the current state of the art. To this end, we organized a challenge task through BioCreative V to automatically extract CDRs from the literature. We designed two challenge tasks: disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. To assist system development and assessment, we created a large annotated text corpus consisting of human annotations of chemicals, diseases and their interactions from 1500 PubMed articles. A total of 34 teams worldwide participated in the CDR task: 16 (DNER) and 18 (CID). The best systems achieved an F-score of 86.46% for the DNER task, a result that approaches the human inter-annotator agreement (0.8875), and an F-score of 57.03% for the CID task, the highest results ever reported for such tasks. When team results were combined via machine learning, the ensemble system further improved on the best team results, achieving F-scores of 88.89% and 62.80% for the DNER and CID tasks, respectively. Another novel aspect of our evaluation was testing each participating system's ability to return real-time results: the average response times of the teams' DNER and CID web service systems were 5.6 and 9.3 s, respectively. Most teams used hybrid systems based on machine learning for their submissions. Given the level of participation and the results, we found our task to be successful in engaging the text-mining research community, producing a large annotated corpus and improving the results of automatic disease recognition and CDR extraction.

Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/
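The F-scores quoted above are harmonic means of precision and recall. The following is a minimal sketch of how such a score could be computed for CID relation extraction, assuming that predictions and gold annotations are compared as exact-match (document, chemical, disease) triples. The function name and all identifiers in the sample data are hypothetical, and the official task evaluation may differ in detail.

```python
def precision_recall_f1(gold, predicted):
    """Return (precision, recall, F1) for exact-match comparison of two collections."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (document id, chemical id, disease id) triples, for illustration only.
gold = {
    ("doc1", "chem_A", "dis_X"),
    ("doc1", "chem_B", "dis_X"),
    ("doc2", "chem_C", "dis_Y"),
}
predicted = {
    ("doc1", "chem_A", "dis_X"),  # correct
    ("doc2", "chem_C", "dis_Y"),  # correct
    ("doc2", "chem_C", "dis_Z"),  # spurious
}

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

The same function applies to DNER output if each item is a (document, disease concept) pair instead of a relation triple.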

Highlights

Introduction and motivation: Chemicals, diseases and their relations are among the most searched topics by PubMed users worldwide [1, 2], reflecting their central roles in many areas of biomedical research and healthcare such as drug discovery and safety surveillance.
A total of 25 teams submitted 34 systems for testing in the chemical-disease relations (CDR) task: 16 systems were tested in conjunction with the disease named entity recognition (DNER) task, and 18 for the chemical-induced disease (CID) task.
To determine the difficulty of DNER and CID tasks, we examined how many teams correctly identified each of the gold standard DNER concepts and CID relations in the test set.
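The difficulty analysis in the last highlight can be sketched as a count, for each gold-standard item, of how many teams returned it. This is an illustrative sketch only: the function names, team labels and item identifiers are hypothetical, and exact matching of items is an assumption rather than the organizers' documented procedure.

```python
from collections import Counter


def teams_per_gold_item(gold, team_predictions):
    """Map each gold item to the number of teams that predicted it."""
    gold = set(gold)
    counts = {item: 0 for item in gold}
    for predictions in team_predictions.values():
        # Only predictions that match a gold item count; spurious ones are ignored.
        for item in set(predictions) & gold:
            counts[item] += 1
    return counts


def difficulty_histogram(counts):
    """Map 'number of teams that found an item' to 'number of such gold items'."""
    return Counter(counts.values())


# Hypothetical gold items and team outputs, for illustration only.
gold = {"g1", "g2", "g3"}
team_predictions = {
    "team_A": {"g1", "g2"},
    "team_B": {"g1"},
    "team_C": {"g1", "not_in_gold"},
}

counts = teams_per_gold_item(gold, team_predictions)
histogram = difficulty_histogram(counts)
print(counts)
print(histogram)
```

Items found by no team (here `g3`) are the hardest cases, while items found by every team (here `g1`) are the easiest.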

Summary

Introduction

Chemicals, diseases and their relations are among the most searched topics by PubMed users worldwide [1, 2], reflecting their central roles in many areas of biomedical research and healthcare such as drug discovery and safety surveillance. Although the ultimate goal in drug discovery is to develop chemicals for therapeutics, recognition of adverse drug reactions (ADRs) between chemicals and diseases is important for improving chemical safety and toxicity studies and for facilitating new screening assays for pharmaceutical compound survival. Manual annotation of such mechanistic and biomarker/correlative chemical-disease relations (CDR) from unstructured free text into structured knowledge, to facilitate identification of potential toxicity, has been an important theme for several bioinformatics databases, such as the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) [5]. NOTE: We consider the words ‘drug’ and ‘chemical’ to be interchangeable in this document.



Journal: Database	Publication Date: Jan 1, 2016
Citations: 164	License type: cc-by


