Different evolutionary trends form the twilight zone of the bacterial pan-genome.

Gal Horesh, Stephanie Mcgimpsey, Florent Lassalle, Nicholas R Thomson, Jukka Corander, Alyce Taylor-Brown, Eva Heinz

doi:10.1099/mgen.0.000670

Abstract

The pan-genome is defined as the combined set of all genes in the gene pool of a species. Pan-genome analyses have been very useful in helping to understand different evolutionary dynamics of bacterial species: an open pan-genome often indicates a free-living lifestyle with metabolic versatility, while closed pan-genomes are linked to host-restricted, ecologically specialized bacteria. A detailed understanding of the species pan-genome has also been instrumental in tracking the phylodynamics of emerging drug resistance mechanisms and drug-resistant pathogens. However, current approaches to analyse a species’ pan-genome do not take the species population structure into account, nor do they account for the uneven sampling of different lineages, as is commonplace due to over-sampling of clinically relevant representatives. Here we present the application of a population structure-aware approach for classifying genes in a pan-genome based on within-species distribution. We demonstrate our approach on a collection of 7500 Escherichia coli genomes, one of the most-studied bacterial species and used as a model for an open pan-genome. We reveal clearly distinct groups of genes, clustered by different underlying evolutionary dynamics, and provide a more biologically informed and accurate description of the species’ pan-genome.

Highlights

Advances in whole genome sequencing in the last two decades and the ability to sequence multiple isolates of the same species have revealed that, often, only a small fraction of genes are shared by all species members.
To demonstrate how one can refine a pan-genome description while accounting for population structure, we used a recently published genome collection that includes over 7500 E. coli and Shigella sp. genomes isolated from human hosts, referred to as the Horesh collection [11].
The genomes in the Horesh collection were collated from publications and other public resources, representing the known diversity of the clinical E. coli isolate genomes available in public databases, and underwent quality-control steps to ensure a final set of high-quality genomes.

Summary

Introduction

Advances in whole genome sequencing in the last two decades and the ability to sequence multiple isolates of the same species have revealed that, often, only a small fraction of genes are shared by all species members. Measuring gene frequencies across the whole dataset does not account for the population structure or biased sampling of the genomes in the dataset. Such simple classification can be problematic when the population of interest consists of multiple deep-branching lineages that are unevenly represented in the collection. If 50 % of a genome collection is represented by one lineage that was heavily over-sampled compared to other lineages, and all isolates of that lineage have a particular gene which is absent in all other lineages, this gene will be defined as an ‘intermediate’ gene. Based on these definitions alone, it would not be differentiated from a gene that is found in all isolates of
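The sampling problem described above can be made concrete with a small sketch. The Python below is an illustration only, not the authors' implementation: the toy collection, the function names and the frequency cut-offs (`CORE_MIN`, `RARE_MAX`) are all assumptions introduced here. It compares a collection-wide gene frequency with frequencies computed separately within each lineage, for one gene fixed in an over-sampled lineage and another gene present in half of the isolates of every lineage.

```python
from collections import defaultdict

# Assumed illustrative cut-offs; these values are not taken from the text above.
CORE_MIN = 0.95
RARE_MAX = 0.15


def classify(freq):
    """Label a frequency as core, intermediate or rare using the assumed cut-offs."""
    if freq >= CORE_MIN:
        return "core"
    if freq < RARE_MAX:
        return "rare"
    return "intermediate"


def collection_frequency(carriers, lineage_of):
    """Fraction of all genomes in the collection that carry the gene."""
    return len(carriers) / len(lineage_of)


def lineage_frequencies(carriers, lineage_of):
    """Fraction of genomes carrying the gene, computed separately per lineage."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for genome, lineage in lineage_of.items():
        totals[lineage] += 1
        if genome in carriers:
            hits[lineage] += 1
    return {lineage: hits[lineage] / totals[lineage] for lineage in totals}


# Hypothetical toy collection: lineage A is over-sampled and makes up 50 % of it.
sizes = {"A": 10, "B": 6, "C": 4}
lineage_of = {f"{lin}{i}": lin for lin, n in sizes.items() for i in range(n)}

# gene_x: present in every lineage-A genome and absent everywhere else.
gene_x = {f"A{i}" for i in range(sizes["A"])}
# gene_y: present in every second genome of each lineage (half of each lineage).
gene_y = {f"{lin}{i}" for lin, n in sizes.items() for i in range(0, n, 2)}

for name, carriers in (("gene_x", gene_x), ("gene_y", gene_y)):
    overall = collection_frequency(carriers, lineage_of)
    per_lineage = lineage_frequencies(carriers, lineage_of)
    labels = {lin: classify(f) for lin, f in per_lineage.items()}
    print(name, overall, classify(overall), labels)
```

Under these assumptions both genes have a collection-wide frequency of 0.5 and receive the same ‘intermediate’ label, although one is fixed in a single lineage and the other segregates at intermediate frequency in all three. Only the per-lineage frequencies separate the two cases.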

Journal: Microbial Genomics
Publication Date: Sep 9, 2021
Citations: 24
License type: CC BY 4.0
