Abstract

Background: Phylogenetic analyses of bacterial genomes based on a simple classification into core and accessory gene pools may offer an incomplete view of evolutionary processes, some of which remain unresolved. A combined strategy based on stratified phylogeny and ancient molecular polymorphisms is proposed to infer detailed evolutionary reconstructions from a large number of whole genomes. This strategy, applied to the highest number of genomes available in public databases, was evaluated for improving knowledge of the ancient diversification of E. coli. The resulting staggered evolutionary scenario was also used to investigate whether the diversification of the ancient E. coli lineages could be associated with particular lifestyles and adaptive strategies.

Results: Phylogenetic reconstructions, exploiting 6220 genomes available in GenBank, established the E. coli core genome at 1023 genes, representing about 20% of the complete genome. The combined strategy using stratified phylogeny plus molecular polymorphisms inferred three ancient lineages (D, EB1A and FGB2). Lineage D was the closest to the E. coli root. A staggered diversification could also be proposed within the EB1A and FGB2 lineages and for the phylogroups within these lineages. Several molecular markers suggest that each lineage followed a different adaptive trajectory. The analysis of gained and lost genes in the main lineages showed that carbohydrate utilization functions (uptake and metabolism) were gained principally in the EB1A lineage, whereas a loss of environmental-adaptive functions was observed in the FGB2 lineage, which also showed more accumulated mutations and ancient recombination events. The population structure of E. coli was re-evaluated by including up to 7561 newly sequenced genomes, which revealed a more complex population structure, and a new phylogroup, phylogroup I, was proposed.

Conclusions: A staggered reconstruction of E. coli phylogeny is proposed, indicating evolution from three ancestral lineages to reach all main known phylogroups. New phylogroups were confirmed, suggesting an increasingly complex population structure of E. coli. However, these new phylogroups represent < 1% of the global E. coli population. A few key evolutionary forces have driven the diversification of the two main E. coli lineages: metabolic flexibility in one of them and colonization-virulence in the other.

Highlights

  • Phylogenetic analyses of bacterial genomes based on a simple classification into core and accessory gene pools may offer an incomplete view of evolutionary processes, some of which remain unresolved

  • As many genes found in bacterial genomes were acquired at different evolutionary times, our proposal was based on elucidating the successive steps in the E. coli diversification following the combination of two analytic approaches

  • A phylogenetic tree was constructed with these 1023 genes, to define the reference phylogeny, identifying the E. coli phylogroups


Summary

Introduction

Phylogenetic analyses of bacterial genomes based on a simple classification into core and accessory gene pools may offer an incomplete view of evolutionary processes, some of which remain unresolved. As many genes found in bacterial genomes were acquired at different evolutionary times, our proposal was based on elucidating the successive steps in E. coli diversification by combining two analytic approaches. Phylogenetic reconstructions were carried out using genes sharing equivalent phylogenetic depth, representing the different evolutionary steps, from the most ancient to the most recent evolutionary ranks. These evolutionary levels successively consider the minimal genome (designated the bacteria-core genome), the genus-core genome, the species-core genome, the phylogroup-core genome and the subphylogroup-core genome. The remaining pool of genes, which was not assigned to any of these core genomes, was considered the accessory genome (Fig. 1). This analysis, which we designate "stratified phylogeny" (SP), permits phylogenetic reconstruction of the progressive diversification of E. coli and might identify trends that occur over different evolutionary timescales in different lineages. Once the ancestral reconstruction was resolved, the ancestral gains and losses of genes were studied in order to understand the adaptive trajectories of the current lineages and to provide some insight into the drivers of diversification.
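The layering of core genomes described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that a gene belongs to the broadest level in which it is present in every genome, and that presence/absence calls are already available. The `Genome` record, the `assign_stratum` function, and all genome, gene and subphylogroup names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Genome:
    """Hypothetical genome record with its taxonomic labels."""
    name: str
    genus: str
    species: str
    phylogroup: Optional[str] = None
    subphylogroup: Optional[str] = None


def assign_stratum(present_in, genomes, genus="Escherichia", species="coli"):
    """Return the broadest level at which the gene is present in every genome.

    Assumption (not from the source): a gene is 'core' at a level when it is
    found in all genomes of that level; levels are tested from most ancient
    (all bacteria) to most recent (a single subphylogroup).
    """
    all_names = {g.name for g in genomes}
    genus_names = {g.name for g in genomes if g.genus == genus}
    species_names = {
        g.name for g in genomes if g.genus == genus and g.species == species
    }

    if all_names <= present_in:
        return "bacteria-core"
    if genus_names and genus_names <= present_in:
        return "genus-core"
    if species_names and species_names <= present_in:
        return "species-core"

    # Below species level, a gene is core if it saturates at least one group.
    for level in ("phylogroup", "subphylogroup"):
        groups = {}
        for g in genomes:
            key = getattr(g, level)
            if g.name in species_names and key is not None:
                groups.setdefault(key, set()).add(g.name)
        if any(members <= present_in for members in groups.values()):
            return level + "-core"

    return "accessory"


# Toy data set: one outgroup bacterium, one other Escherichia, three E. coli.
genomes = [
    Genome("Bsub", "Bacillus", "subtilis"),
    Genome("Efer", "Escherichia", "fergusonii"),
    Genome("Ec_D1", "Escherichia", "coli", "D", "D1"),
    Genome("Ec_D2", "Escherichia", "coli", "D", "D2"),
    Genome("Ec_A1", "Escherichia", "coli", "A", "A1"),
]

# Hypothetical presence/absence calls: gene -> genomes carrying it.
presence = {
    "gene_universal": {"Bsub", "Efer", "Ec_D1", "Ec_D2", "Ec_A1"},
    "gene_genus": {"Efer", "Ec_D1", "Ec_D2", "Ec_A1"},
    "gene_species": {"Ec_D1", "Ec_D2", "Ec_A1"},
    "gene_phylogroup": {"Ec_D1", "Ec_D2"},
    "gene_subphylogroup": {"Ec_D1"},
    "gene_sporadic": {"Bsub"},
}

strata = {gene: assign_stratum(found, genomes) for gene, found in presence.items()}
for gene, stratum in strata.items():
    print(gene, stratum)
```

In this toy setting each gene falls into exactly one stratum, so a separate tree could in principle be built from the genes of each stratum. Real data would need a tolerance for missing or fragmented genes rather than the strict "present in all genomes" rule assumed here.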

