Abstract

In this article, we describe our system for the CHEMPROT task of the BioCreative VI challenge. Although considerable research on the named entity recognition of genes and drugs has been conducted, there is limited research on extracting relationships between them. Extracting relations between chemical compounds and genes from the literature is an important element in pharmacological and clinical research. The CHEMPROT task of BioCreative VI aims to promote the development of text mining systems that can be used to automatically extract relationships between chemical compounds and genes. We tested three recursive neural network approaches to improve the performance of relation extraction. In the BioCreative VI challenge, we developed a tree-Long Short-Term Memory networks (tree-LSTM) model with several additional features including a position feature and a subtree containment feature, and we also applied an ensemble method. After the challenge, we applied additional pre-processing steps to the tree-LSTM model, and we tested the performance of another recursive neural network model called Stack-augmented Parser Interpreter Neural Network (SPINN). Our tree-LSTM model achieved an F-score of 58.53% in the BioCreative VI challenge. Our tree-LSTM model with additional pre-processing and the SPINN model obtained F-scores of 63.7 and 64.1%, respectively.Database URL: https://github.com/arwhirang/recursive_chemprot

Highlights

  • There is an increasing interest to find relationships between biological and chemical entities in the literature and store the relationship information in the form of a structured knowledgebase

  • We introduce three Recursive Neural Network approaches that use the syntactical features of each node in a parse tree, exploiting the recursive structure of natural language sentences [3]

  • In our previous work [11], we extracted drug–drug interactions (DDIs) using tree-long short-term memory (LSTM) with position and subtree containment features for a DDI task [12] that involved extracting four relations that can occur between two drugs

Read more

Summary

Introduction

There is an increasing interest to find relationships between biological and chemical entities in the literature and store the relationship information in the form of a structured knowledgebase. An accurate and comprehensive knowledgebase can play an important role in many downstream applications in precision medicine. The gap between the information curated in the existing knowledgebases and the information available in the literature widens every day as the volume of biomedical literature rapidly increases. On average, >3000 papers are published daily in the biomedical domain alone. Manual curation approaches are still widely used to ensure the quality of the contents in knowledgebases, completely manual approaches are too VC The Author(s) 2018.

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.