Abstract

Background: The increasing availability of whole genome sequences allows the gene or protein content of different organisms to be compared, leading to burgeoning interest in the relatively new subfield of pan-genomics. However, while several studies have analyzed protein content relationships in specific groups of bacteria, there has yet to be a study that provides a general characterization of protein content relationships in a broad range of bacteria.

Results: A variation on reciprocal BLAST hits was used to infer relationships among proteins in several groups of bacteria, and data regarding protein conservation and uniqueness in different bacterial genera are reported in terms of "core proteomes", "unique proteomes", and "singlets". We also analyzed the relationship between protein content similarity and the percent identity of the 16S rRNA gene in pairs of bacterial isolates from the same genus, and found that the strength of this relationship varied substantially depending on the genus, perhaps reflecting different rates of genome evolution and/or horizontal gene transfer. Finally, core proteomes and unique proteomes were used to study the proteomic cohesiveness of several bacterial species, revealing that some bacterial species had little cohesiveness in their protein content, with some having fewer proteins unique to that species than randomly-chosen sets of isolates from the same genus.

Conclusions: The results described in this study aid our understanding of protein content relationships in different bacterial groups, allowing us to make further inferences regarding genome-environment relationships, genome evolution, and the soundness of existing taxonomic classifications.

Highlights

  • The increasing availability of whole genome sequences allows the gene or protein content of different organisms to be compared, leading to burgeoning interest in the relatively new subfield of pan-genomics

  • It was found that different bacterial genera vary widely in core proteome size, unique proteome size, and the number of singlets that their isolates contain, and that these variables are explained only partly by differences in proteome size

  • We found that the relationship between protein content similarity and the percent identity of the 16S rRNA gene varied substantially in different genera, with a fairly strong association in a few genera and little or no association in most other genera
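The terms "core proteome", "unique proteome", and "singlets" can be illustrated with plain set operations. The sketch below is not the study's pipeline. It assumes that proteins have already been grouped into families (for example from reciprocal BLAST hits) and that each isolate is represented by the set of family identifiers it contains. The definitions used here are working assumptions: a core proteome is the set of families present in every isolate of a group, a unique proteome is the part of the core found in no isolate outside the group, and singlets are families found in exactly one isolate. All function names, isolate names, and family IDs are hypothetical.

```python
from typing import Dict, Set

# Hypothetical representation: isolate name -> set of protein family IDs.
Isolates = Dict[str, Set[str]]


def core_proteome(group: Isolates) -> Set[str]:
    """Families present in every isolate of the group (assumed definition)."""
    families = list(group.values())
    return set.intersection(*families) if families else set()


def unique_proteome(group: Isolates, outsiders: Isolates) -> Set[str]:
    """Core families of the group that occur in no isolate outside it."""
    seen_outside = set().union(*outsiders.values())
    return core_proteome(group) - seen_outside


def singlets(group: Isolates) -> Dict[str, Set[str]]:
    """For each isolate, the families found in no other isolate of the group."""
    result = {}
    for name, families in group.items():
        others = set().union(
            *(f for other, f in group.items() if other != name)
        )
        result[name] = families - others
    return result


# Toy data, invented purely for illustration.
genus_a = {
    "A1": {"fam1", "fam2", "fam3"},
    "A2": {"fam1", "fam2", "fam4"},
}
genus_b = {"B1": {"fam2", "fam5"}}

core = core_proteome(genus_a)
unique = unique_proteome(genus_a, genus_b)
per_isolate_singlets = singlets(genus_a)
```

With this toy input, `fam1` and `fam2` form the core of the hypothetical genus A, only `fam1` is unique to it because `fam2` also occurs in `B1`, and `fam3` and `fam4` are singlets of `A1` and `A2` respectively.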


Summary

Introduction

The increasing availability of whole genome sequences allows the gene or protein content of different organisms to be compared, leading to burgeoning interest in the relatively new subfield of pan-genomics. Taxonomic analyses have been performed using a diverse and often arbitrary selection of morphological and phenotypic characteristics. Today, these characteristics are generally considered unsuitable for generating reliable and consistent taxonomies for prokaryotes, as there is no rational basis for choosing which morphological or phenotypic properties should be examined. While 16S rRNA gene sequence analysis and MLSA have proven to be effective tools for phylogenetics, a major deficiency inherent in these techniques is that only a small amount of information is used to represent an entire organism. This practice has largely been accepted due to the time and cost of genome sequencing. The accelerating pace of genome sequencing provides the opportunity to explore the use of entire genomes in analyzing evolutionary relationships.

