BioDEAL: community generation of biological annotations

Paul Breimyer,Nagiza F Samatova,Vinay Kumar,Nathan Green

doi:10.1186/1472-6947-9-s1-s5

Abstract

Background: Publication databases in biomedicine (e.g., PubMed, MEDLINE) are growing rapidly every year, as are public databases of experimental biological data and the annotations derived from those data. Publications often contain evidence that confirms or disproves annotations, such as putative protein functions. However, it is increasingly difficult for biologists to identify and process published evidence because of the volume of papers and the lack of a systematic approach for associating published evidence with experimental data and annotations. Natural Language Processing (NLP) tools can help address the growing divide by providing automatic high-throughput detection of simple terms in publication text. However, NLP tools are not mature enough to identify complex terms, relationships, or events.

Results: In this paper we present and extend BioDEAL, a community evidence annotation system that introduces a feedback loop into the database-publication cycle to allow scientists to connect data-driven biological concepts to publications.

Conclusion: BioDEAL may change the way biologists relate published evidence to experimental data. Instead of biologists or research groups searching and managing evidence independently, the community can collectively build and share this knowledge.

Highlights

Over the past decade, systems biology research has undergone two key transformations.
The Publication panel contains the publication text, for example, from PubMed or MEDLINE; BioDEAL supports both PDF and text (HTML, PHP, etc.) documents. The former is typically a full publication identified by its Uniform Resource Locator (URL) on the journal web site, while the latter may be an abstract from PubMed.
BioDEAL can present annotations generated by external projects such as BioCreAtIvE [9,18], whose overarching goal is to enhance abstracts with annotations.

Summary

Introduction

Systems biology research has undergone two key transformations. Public databases of experimentally generated -omics data are increasing in number, size, and diversity, along with annotations predicted from these data by computational tools. Such annotations may include protein functions predicted as part of genome annotation pipelines, high-resolution 3-dimensional protein structures predicted from amino acid sequence information alone, and protein-protein interactions and interaction networks derived from databases of yeast-2-hybrid or mass spectrometry pull-down experiments. There are currently over 20 million scientific abstracts in MEDLINE, growing at 500,000 articles per year [1]. Such articles often report the discovered evidence (e.g., mutagenesis experiments) for various hypotheses derived by mining these heterogeneous databases of publicly available data and annotations. NLP tools are not mature enough to identify complex terms, relationships, or events.

Methods

Results

Discussion

Conclusion


Journal: BMC Medical Informatics and Decision Making	Publication Date: Nov 3, 2009
Citations: 15	License type: cc-by


