Abstract

Background: Biomedical studies need assistance from automated tools and easily accessible data to cope with the rapidly accumulating literature. Text-mining tools and curated databases have been developed to address such needs, and they can be applied to improve the understanding of the molecular pathogenesis of complex diseases like thyroid cancer.

Results: We have developed a system, PWTEES, which extracts pathway interactions from the literature using an existing event extraction tool (TEES) and pathway named entity recognition (PathNER). We then applied the system to a thyroid cancer corpus and systematically extracted molecular interactions involving either genes or pathways. With the extracted information, we constructed a molecular interaction network taking genes and pathways as nodes. Using curated pathway information and network topological analyses, we highlight key genes and pathways involved in thyroid carcinogenesis.

Conclusions: Mining events involving genes and pathways from the literature and integrating curated pathway knowledge can help improve the understanding of molecular interactions in complex diseases. The system developed for this study can be applied in studies other than thyroid cancer. The source code is freely available online at https://github.com/chengkun-wu/PWTEES.
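The network construction described in the Results can be illustrated with a minimal sketch. It assumes that extracted events have already been reduced to pairs of typed entities, and it uses node degree as one simple topological measure; the abstract does not state which measures were used. The event tuples, entity names, and the functions `build_network` and `rank_by_degree` are hypothetical and are not part of PWTEES.

```python
from collections import defaultdict

# Hypothetical extracted events: (source, source_type, target, target_type).
# These tuples are illustrative placeholders, not output from PWTEES.
events = [
    ("GENE_A", "gene", "PATHWAY_X", "pathway"),
    ("GENE_B", "gene", "PATHWAY_X", "pathway"),
    ("GENE_A", "gene", "GENE_B", "gene"),
    ("GENE_C", "gene", "PATHWAY_Y", "pathway"),
    ("GENE_A", "gene", "PATHWAY_X", "pathway"),  # repeated mention of the same pair
]


def build_network(events):
    """Build an undirected graph as adjacency sets, plus a node-type map."""
    adjacency = defaultdict(set)
    node_type = {}
    for src, src_type, dst, dst_type in events:
        node_type[src] = src_type
        node_type[dst] = dst_type
        if src != dst:  # ignore self-interactions in this sketch
            adjacency[src].add(dst)
            adjacency[dst].add(src)
    return adjacency, node_type


def rank_by_degree(adjacency):
    """Return (node, degree) pairs, highest degree first, ties broken by name."""
    pairs = ((node, len(neighbours)) for node, neighbours in adjacency.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


adjacency, node_type = build_network(events)
ranking = rank_by_degree(adjacency)

for node, degree in ranking:
    print(f"{node}\t{node_type[node]}\t{degree}")
```

Because neighbours are stored in sets, repeated mentions of the same interaction collapse into a single edge. A weighted variant that counts mentions would be a natural alternative if literature frequency matters.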

Highlights

  • Biomedical literature is a primary knowledge source for life science research, which facilitates the information and knowledge exchange through various biomedical studies

  • Evaluation of PWTEES: for molecular events that do not involve pathways, PWTEES is equivalent to the Turku Event Extraction System (TEES), which has already been thoroughly evaluated [38]

  • We focus on evaluating the performance of PWTEES on pathway events

Introduction

Biomedical literature is a primary knowledge source for life science research, facilitating the exchange of information and knowledge across biomedical studies. Over the past two decades, the total citation count has grown by around 4% per year [1]. The sheer volume of this literature and its unstructured nature make it virtually impossible for researchers to keep track of all published results manually. Curated databases constitute another important source of knowledge for biomedical studies. Biomedical studies therefore need assistance from automated tools and accessible data to cope with the rapidly accumulating literature. Text-mining tools and curated databases have been developed to address such needs, and they can be applied to improve the understanding of the molecular pathogenesis of complex diseases like thyroid cancer.
