Abstract

The evolution, habitat, and lifestyle of cryptic clade II of Escherichia, whose members were first recovered at low frequency from non-human hosts and later from external environments, have been poorly understood. Here, we analyzed the genomes of selected strains for preliminary indications of ecological differentiation within the population. We used the delta-bitscore metric to detect functional divergence among their orthologous genes and trained a random forest classifier to differentiate the genomes by habitat (gastrointestinal vs. external environment). The model was built with the inclusion of other Escherichia genomes previously shown to exhibit genomic traits of adaptation to one of the two habitats. Overall, gene degradation was more prominent in the gastrointestinal strains. The trained model correctly classified the genomes and identified a set of predictor genes that were informative of habitat association. Functional divergence in many of these genes was reflective of ecological divergence. The accuracy of the trained model was confirmed by its correct prediction of the habitats of an independent set of strains with known habitat association. In summary, cryptic clade II of Escherichia displayed genomic signatures consistent with divergent adaptation to gastrointestinal and external environments.
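To make the delta-bitscore idea concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes that a profile-HMM bit score is already available for each orthologous gene in a reference genome and in each query genome, and it takes the delta bitscore of a gene to be the reference score minus the query score. All gene names, genome names, and score values are hypothetical. The ranking by difference in group means is a deliberately simple stand-in for the feature importance that a random forest classifier would provide.

```python
from statistics import mean


def delta_bitscore(ref_scores, query_scores):
    """Per-gene delta bitscore: reference bit score minus query bit score.

    Only genes scored in both genomes (shared orthologs) are compared.
    """
    return {
        gene: ref_scores[gene] - query_scores[gene]
        for gene in ref_scores
        if gene in query_scores
    }


def rank_genes_by_habitat_gap(dbs_by_genome, habitat_by_genome,
                              habitats=("gastrointestinal", "environmental")):
    """Rank genes by the gap in mean delta bitscore between two habitats.

    Assumes every habitat has at least one genome; this is a simple
    stand-in for classifier-derived feature importance.
    """
    first, second = habitats
    # Restrict to genes that have a delta bitscore in every genome.
    shared = set.intersection(*(set(d) for d in dbs_by_genome.values()))
    gaps = {}
    for gene in shared:
        in_first = [d[gene] for name, d in dbs_by_genome.items()
                    if habitat_by_genome[name] == first]
        in_second = [d[gene] for name, d in dbs_by_genome.items()
                     if habitat_by_genome[name] == second]
        gaps[gene] = mean(in_first) - mean(in_second)
    # Largest absolute gap first.
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical bit scores for three orthologs (illustrative values only).
reference = {"geneA": 300.0, "geneB": 150.0, "geneC": 200.0}
genomes = {
    "gut_1": {"geneA": 250.0, "geneB": 150.0, "geneC": 198.0},
    "gut_2": {"geneA": 240.0, "geneB": 149.0, "geneC": 200.0},
    "env_1": {"geneA": 299.0, "geneB": 151.0, "geneC": 200.0},
    "env_2": {"geneA": 300.0, "geneB": 150.0, "geneC": 199.0},
}
habitat = {
    "gut_1": "gastrointestinal",
    "gut_2": "gastrointestinal",
    "env_1": "environmental",
    "env_2": "environmental",
}

dbs = {name: delta_bitscore(reference, scores) for name, scores in genomes.items()}
ranking = rank_genes_by_habitat_gap(dbs, habitat)
```

In this toy data set the hypothetical `geneA` ranks first because its score drops sharply only in the gastrointestinal genomes, which is the kind of habitat-associated divergence a classifier could exploit. In practice, the per-genome delta-bitscore vectors would serve as the feature matrix for a classifier such as a random forest.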

Highlights

  • Related bacterial lineages can have very different habitats and niches; ecological differentiation has been reported between Vibrionaceae strains coexisting in the coastal ocean [1], as well as between typical E. coli and environmental cryptic clades of Escherichia [2,3]

  • We first annotated the genomes and assessed assembly completeness in terms of gene content using the Benchmarking Universal Single-Copy Orthologs (BUSCO) set of 781 single-copy orthologs found among 216 species of the order Enterobacteriales

  • It is possible that the discovery of these strains from both environments reflected a lifestyle similar to that of cryptic clade V of Escherichia, whose members were able to colonize gastrointestinal tracts while retaining traits favoring their survival in external environments [34,36]



Introduction

Related bacterial lineages can have very different habitats and niches; ecological differentiation has been reported between Vibrionaceae strains coexisting in the coastal ocean [1], as well as between typical E. coli (host-associated) and environmental cryptic clades of Escherichia [2,3]. Identifying and characterizing bacterial populations with distinct ecological niches (ecotypes) has been fundamental to understanding their ecology and evolution [9,10]. The rapid growth of “omics” analyses (genomics, transcriptomics, proteomics, phenomics, etc.), together with the insights gained from the ever-expanding collection of bacterial genome sequence data, has substantially advanced our understanding of ecological differentiation and niche adaptation in bacteria [3,8,11,12,13]. Attempts to gain further functional insight into the biological mechanisms underlying niche adaptation among closely related bacteria have been greatly facilitated by artificial intelligence approaches such as machine learning, which add depth to the interpretation of massive and complex genomic data. Machine learning has made it possible to combine data across whole genomes or proteomes, predict phenotypes from genotypes, and identify genetic signatures of niche adaptation in pathogenic bacteria, tasks that would be difficult if not impossible with other methods [13,14,15,16].
