Abstract

The evolutionary history of a protein reflects the functional history of its ancestors. Recent phylogenetic studies identified distinct evolutionary signatures that characterize proteins involved in cancer, Mendelian disease, and different ontogenic stages. Despite the potential to yield insight into the cellular functions and interactions of proteins, such comparative phylogenetic analyses are rarely performed, because they require custom algorithms. We developed ProteinHistorian to make tools for performing analyses of protein origins widely available. Given a list of proteins of interest, ProteinHistorian estimates the phylogenetic age of each protein, quantifies enrichment for proteins of specific ages, and compares variation in protein age with other protein attributes. ProteinHistorian allows flexibility in the definition of protein age by including several algorithms for estimating ages from different databases of evolutionary relationships. We illustrate the use of ProteinHistorian with three example analyses. First, we demonstrate that proteins with high expression in human, compared to chimpanzee and rhesus macaque, are significantly younger than those with human-specific low expression. Next, we show that human proteins with annotated regulatory functions are significantly younger than proteins with catalytic functions. Finally, we compare protein length and age in many eukaryotic species and, as expected from previous studies, find a positive, though often weak, correlation between protein age and length. ProteinHistorian is available through a web server with an intuitive interface and as a set of command line tools; this allows biologists and bioinformaticians alike to integrate these approaches into their analysis pipelines. ProteinHistorian's modular, extensible design facilitates the integration of new datasets and algorithms. 
The ProteinHistorian web server, source code, and pre-computed ages for 32 eukaryotic genomes are freely available under the GNU public license at http://lighthouse.ucsf.edu/ProteinHistorian/.

Highlights

  • The proteins present in a species arose at a range of evolutionary times, and the context of a protein’s origin can provide information about its cellular functions and interactions [1,2]

  • An early prototype of the ProteinHistorian tool was used in a recent investigation of the evolutionary origins of the sirtuins, a protein family that contains several histone deacetylases

  • We present ProteinHistorian, an integrated web server, database, and set of command line tools for carrying out eukaryotic protein age analyses in a simple, intuitive pipeline

Introduction

The proteins present in a species arose at a range of evolutionary times, and the context of a protein’s origin can provide information about its cellular functions and interactions [1,2]. [...] Because different definitions of protein “age” may be appropriate in different contexts, ProteinHistorian offers several strategies for estimating ages from phylogenetic patterns. These strategies draw on different ancestral family reconstruction algorithms [15,16] and pre-existing databases of evolutionary relationships [17,18,19]. To illustrate the use of ProteinHistorian, we describe the computation of ages for all proteins in 32 eukaryotic species using two ancestral family reconstruction algorithms and several different evolutionary databases.
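
The general idea of estimating an age from a phylogenetic pattern can be sketched in a few lines. This is a minimal illustration and not ProteinHistorian's own implementation: it assumes a Dollo-parsimony-style rule, in which a protein family is gained exactly once, at the most recent common ancestor of all species that contain a member. The toy species tree, the family memberships, and the function names are all hypothetical.

```python
# Hypothetical toy species tree, stored as child -> parent (None marks the root).
PARENT = {
    "Human": "Primates",
    "Chimp": "Primates",
    "Primates": "Mammals",
    "Mouse": "Mammals",
    "Mammals": "Eukaryota",
    "Yeast": "Eukaryota",
    "Eukaryota": None,
}


def lineage(node):
    """Return the path from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def origin_node(species_with_family):
    """Most recent common ancestor of every species that has the family.

    Assumption: a single gain at the MRCA (Dollo-style), with any
    absences below that node explained by later losses.
    """
    species = sorted(species_with_family)
    if not species:
        raise ValueError("family must be present in at least one species")
    # Start from one lineage and keep only nodes shared by all the others.
    candidates = lineage(species[0])
    for sp in species[1:]:
        ancestors = set(lineage(sp))
        candidates = [n for n in candidates if n in ancestors]
    # The first surviving node is the deepest shared ancestor.
    return candidates[0]


def age_rank(focal, species_with_family):
    """Number of branches between the focal species and the origin node.

    0 means the family is specific to the focal species; larger values
    mean an older origin along the focal lineage.
    """
    if focal not in species_with_family:
        raise ValueError("focal species must contain the family")
    return lineage(focal).index(origin_node(species_with_family))


if __name__ == "__main__":
    # Hypothetical family memberships for three human proteins.
    families = {
        "protA": {"Human", "Chimp"},
        "protB": {"Human", "Mouse"},
        "protC": {"Human", "Yeast"},
    }
    for name, members in families.items():
        print(name, origin_node(members), age_rank("Human", members))
```

Ranking ages by position on the focal lineage avoids the need for divergence-time estimates. Absolute ages could be attached by mapping each internal node to a dated divergence, which this sketch leaves out.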

