Objective: biochemical function.

Brian P Anton, Martin Steffen, Richard J Roberts, Simon Kasif

doi:10.3389/fgene.2014.00210


Open Access

https://doi.org/10.3389/fgene.2014.00210


Abstract

DNA sequencing enables the discovery of new genes in high-throughput, low-cost experiments. Conversely, gene function is determined by low-throughput, high-cost experiments. This inverse relationship between the two types of data is a major impediment to meeting one of the major scientific challenges of our time: the understanding of genomes. The mismatch in throughput is illustrated by the progress made for one of the earliest sequenced genomes, that of Mycobacterium tuberculosis H37Rv (Mtb). When its genome was published in 1998, more than a quarter of its genes had no known function (Cole et al., 1998). Our lack of knowledge about these approximately 1000 “conserved hypothetical” genes in Mtb represents a serious deficiency in our understanding of its biology. Now, after more than a decade of progress, our knowledge of those proteins' functions is essentially unchanged: more than 900 genes still have no known function (Lew et al., 2011). In contrast, during this same period, the scientific community has sequenced approximately 18,000 new genomes (Pagani et al., 2012), containing millions of new hypothetical proteins. Apparently, the vector of our progress has tipped decisively away from data interpretation and comprehension, and toward mere data collection. To address the issue of gene function testing and annotation for all microbes, we founded COMBREX (COMputational BRidge to EXperiments), an endeavor aimed at accelerating the rate of gene function validation (Anton et al., 2013). Two of COMBREX's more prominent initiatives were the creation of a comprehensive database for protein function data (http://combrex.bu.edu), and the deployment of a crowdsourcing platform to catalyze protein function experimentation. In the course of these two efforts, it became apparent that fundamental changes in approaches to the problem of protein function determination were needed if there was any hope of keeping pace with DNA sequencing.
We suggest that the community work together to (1) re-establish the connection between existing gene annotation and the foundational experimental data that supports all annotation, (2) develop experiment design principles to help guide the identification of maximally informative targets for function validation, (3) invest in the development of higher-throughput approaches for the testing of protein function, and (4) provide an expedited publication pathway for reporting experimental results of gene function, analogous to the reporting of newly sequenced genomes in the journal “Standards in Genomic Sciences.”

Highlights

We have recently developed a workflow for the characterization of hypothetical proteins and applied it to six proteins from H. pylori (Choi et al., 2013).
There needs to be a paradigm shift in the approach taken to determine and assign gene function if there is to be any hope of realizing the potential benefits from the torrent of new genome sequences.

Summary

Introduction

For 3.3 million identified genes, we can currently document experimentally determined functions for just 0.4% of the proteins (13,665 proteins).


Journal: Frontiers in Genetics
Publication Date: Jul 8, 2014
Citations: 12
License type: CC BY

Similar Papers

Standards in Genomic Sciences: New beginnings to reflect the association between the journal and BMC.
George M Garrity
Standards in Genomic Sciences | VOL. 9
01 Jul 2014

Distinguishing between biochemical and cellular function: Are there peptide signatures for cellular function of proteins?
Shruti Jain ... Rachit Bakshi
Proteins | VOL. 85
06 Feb 2017

Cross-organism learning method to discover new gene functionalities
Giacomo Domeniconi ... Pietro Pinoli
Computer Methods and Programs in Biomedicine | VOL. 126
17 Dec 2015

Experience report
Youngik Yang ... Sun Kim
05 Oct 2009
