Abstract

Background

The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes.

Results

Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in how much their topology reflects phylogenetic relationships as opposed to similarities in gene repertoires caused by similar life styles and horizontal gene transfer. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs, and particularly the tree built from concatenated ribosomal protein sequences, seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. The last of these groups also appeared to join the low-GC Gram-positive bacteria at a deeper tree node. These new groupings of bacteria were supported by the analysis of alternative topologies in the concatenated ribosomal protein tree using the Kishino-Hasegawa test and by a census of the topologies of 132 individual groups of orthologous proteins. Additionally, the results of this analysis call into question the sister-group relationship between the two major archaeal groups, Euryarchaeota and Crenarchaeota, and suggest instead that Euryarchaeota might be a paraphyletic group with respect to Crenarchaeota.

Conclusions

We conclude that, the extensive horizontal gene flow and lineage-specific gene loss notwithstanding, extension of phylogenetic analysis to the genome scale has the potential of uncovering deep evolutionary relationships between prokaryotic lineages.

Highlights

  • The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof

  • Even relatively close species, for example Escherichia coli and Haemophilus influenzae, two species of the γ-subdivision of Proteobacteria, retain very little conservation of gene order beyond the operon level, and essentially none is detectable among evolutionarily distant bacteria and archaea [15,16,18]

  • We compare the topologies produced with five largely independent approaches to genome-tree building: i) presence-absence of genomes in Clusters of Orthologous Groups of proteins (COGs); ii) conservation of local gene order among prokaryotic genomes; iii) distribution of percent identity between apparent orthologs; iv) sequence conservation in concatenated alignments of ribosomal proteins; v) comparative analysis of multiple trees reconstructed for representative protein families
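
The first approach in the list above, presence-absence of genomes in COGs, can be illustrated with a minimal sketch. It assumes each genome is summarized as the set of COGs in which it is represented, and it uses the Jaccard distance between those sets purely as an illustrative choice; the distance formulas actually used are given in the paper's Materials and Methods and are not reproduced here. The genome names and COG identifiers are hypothetical toy data.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Distance between two presence-absence profiles (assumed measure)."""
    union = a | b
    if not union:
        # Two empty profiles are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles: dict) -> dict:
    """Symmetric pairwise distances keyed by (genome, genome) tuples."""
    names = sorted(profiles)
    dist = {(n, n): 0.0 for n in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(profiles[x], profiles[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Hypothetical toy profiles: the set of COGs in which each genome is present.
profiles = {
    "genome_A": {"COG0001", "COG0002", "COG0003"},
    "genome_B": {"COG0001", "COG0002", "COG0004"},
    "genome_C": {"COG0005"},
}
dist = distance_matrix(profiles)
```

A matrix of this kind could then be passed to any distance-based tree-building method. Because the profiles record only gene content, two genomes that lost or acquired the same genes look close regardless of ancestry, which is the sensitivity to gene loss and horizontal transfer that the abstract describes.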


Summary

Results

To our knowledge, conserved gene pairs and distributions of identity level between orthologs have not been used previously as the basis for phylogenetic tree construction. Trees built with different cutoff values for symmetrical best hits, four different formulas for the evolutionary distance calculation (see Materials and Methods) and different parameters of the distributions showed essentially the same topology, with strong bootstrap support for most of the clades (Fig. 5 and data not shown). The above three approaches involve construction of genome trees "par excellence", i.e. trees based on integral characteristics of genomes (or, more precisely, gene sets) that are not directly related to the more traditional, alignment-based measures usually employed for calculating evolutionary distances or for parsimony analysis. These genome trees raise several interesting phylogenetic questions: for example, do spirochetes and chlamydia share a common ancestor, and are Euryarchaeota a paraphyletic group with respect to the Crenarchaeota? A wide spread of topologies was observed, but the grouping seen in the concatenated ribosomal protein tree was encountered most often, although in some cases, for example for the spirochete-chlamydia cluster, the lead over other topologies was slim (Figs. 13, 14, 15).
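
The idea of using conserved gene pairs as a genome-level character can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: each genome is represented as a linear list of ortholog identifiers in chromosomal order, a "gene pair" is taken to be two adjacent orthologs irrespective of orientation, and the distance is one minus the fraction of shared pairs relative to the smaller genome. Strand, chromosome circularity and the paper's own distance formulas are ignored, and all identifiers are hypothetical.

```python
def adjacent_pairs(gene_order: list) -> set:
    """Unordered pairs of neighbouring orthologs along a linear gene order."""
    return {
        frozenset(pair)
        for pair in zip(gene_order, gene_order[1:])
        if pair[0] != pair[1]  # skip tandem duplicates of the same ortholog
    }


def pair_distance(order_a: list, order_b: list) -> float:
    """Assumed distance: 1 - shared pairs / pairs in the smaller genome."""
    pairs_a = adjacent_pairs(order_a)
    pairs_b = adjacent_pairs(order_b)
    smaller = min(len(pairs_a), len(pairs_b))
    if smaller == 0:
        # No pairs to compare; treat the genomes as maximally distant.
        return 1.0
    return 1.0 - len(pairs_a & pairs_b) / smaller


# Hypothetical toy gene orders written as ortholog identifiers.
genome_a = ["o1", "o2", "o3", "o4"]
genome_b = ["o2", "o1", "o4", "o3"]
d_ab = pair_distance(genome_a, genome_b)
```

In this toy case two of the three adjacent pairs are shared, so the distance is one third. Normalizing by the smaller genome is one way to avoid penalizing comparisons between genomes of very different size, but it is only one of several reasonable choices.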

