Abstract

The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed for example to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer numbers of domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. We finally identify few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta. These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different than the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups.

Highlights

  • Proteins are active components of molecular machinery that perform vital functions for cellular and organismal life [1,2]

  • We studied the molecular functions of 1,646 domains defined at the fold superfamilies (FSFs) level of structural abstraction (SCOP 1.73) that are present in the proteomes of a total of 965 organisms spanning the three superkingdoms

  • In order to explore whether the overall distribution of general functional categories differs in organisms belonging to the three superkingdoms, we analyzed proteomes at the species level and calculated both the percentage and actual number of FSFs corresponding to different functional repertoires (Figure 2)

Read more

Summary

Introduction

Proteins are active components of molecular machinery that perform vital functions for cellular and organismal life [1,2]. Nascent polypeptide chains are unfolded random coils but quickly undergo conformational changes to produce characteristic and functional folds These folds are three-dimensional (3D) structures that define the native state of proteins [3,4]. While molecular interactions between domains in mutidomain proteins play important roles in the evolution of protein repertoires [6], it is the domain structure that is maintained in proteins for long periods of evolutionary time [7,8,9]. This is in sharp contrast to amino acid sequence, which is highly variable. Protein domains are considered evolutionary units [7,10,11,12]

Classification of Domains
Assigning FSF Structures to Proteomes
Assigning Functional Categories to Protein Domains
General Patterns in the Distribution of FSF Domain Functions
Distribution of FSF Domain Functions in the Three Superkingdoms of Life
Effect of Organism Lifestyle
Analysis of Minor Functional Categories
Reliability of Functional Annotations and Conclusions of this Study
Data Retrieval
Statistical Analysis
Conclusions

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.