Abstract

Although recombination is accepted to be common in bacteria, for many species robust phylogenies with well-resolved branches can be reconstructed from whole genome alignments of strains, and these are generally interpreted to reflect clonal relationships. Using new methods based on the statistics of single-nucleotide polymorphism (SNP) splits, we show that this interpretation is incorrect. For many species, each locus has recombined many times along its line of descent, and instead of many loci supporting a common phylogeny, the phylogeny changes many thousands of times along the genome alignment. Analysis of the patterns of allele sharing among strains shows that bacterial populations cannot be approximated as either clonal or freely recombining but are structured such that recombination rates between lineages vary over several orders of magnitude, with a unique pattern of rates for each lineage. Thus, rather than reflecting clonal ancestry, whole genome phylogenies reflect distributions of recombination rates.
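To make the notion of a SNP split concrete, here is a minimal sketch that is not the authors' method. Each biallelic alignment column divides the strains into two groups, and two such splits can lie on a single tree only if at least one of the four pairwise group intersections is empty (the classical four-gamete condition). The toy alignment, the strain names, and the helper names `snp_splits` and `compatible` are all hypothetical, chosen for illustration.

```python
from itertools import combinations


def snp_splits(alignment):
    """Return (position, split) for every biallelic column.

    A split is stored as the frozenset of strains whose allele differs
    from that of the first strain, which gives each split a canonical side.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    splits = []
    for pos in range(length):
        column = {name: alignment[name][pos] for name in names}
        if len(set(column.values())) != 2:
            continue  # skip invariant and multi-allelic columns
        ref = column[names[0]]
        side = frozenset(n for n in names if column[n] != ref)
        splits.append((pos, side))
    return splits


def compatible(a, b, all_names):
    """Two splits fit on one tree iff one of the four intersections is empty."""
    a_c = all_names - a
    b_c = all_names - b
    return any(not (x & y) for x in (a, a_c) for y in (b, b_c))


# Hypothetical toy alignment of four strains (assumption, not data from the paper).
alignment = {
    "s1": "ACTG",
    "s2": "ATTG",
    "s3": "GCTG",
    "s4": "GTTA",
}
names = frozenset(alignment)
splits = snp_splits(alignment)

# Pairs of SNP positions whose splits cannot both sit on the same tree.
incompatible_pairs = [
    (p, q)
    for (p, a), (q, b) in combinations(splits, 2)
    if not compatible(a, b, names)
]
print(incompatible_pairs)
```

In this toy case the columns at positions 0 and 1 group the strains as {s3, s4} and {s2, s4}, which conflict, so no single tree explains both. Under the simplifying assumption that each site mutated only once, such a conflict indicates that the phylogeny differs between the two positions.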

Highlights

  • The only illustration that appears in Darwin’s Origin of Species (Darwin, 1859) is of a phylogenetic tree

  • Applying the methods and statistics that we developed for E. coli to a set of other bacterial species, namely Bacillus subtilis, Helicobacter pylori, Mycobacterium tuberculosis, Salmonella enterica, and Staphylococcus aureus, we show that all of them follow the same general behavior as E. coli, with the exception of M. tuberculosis, where all strains are very closely related and most if not all DNA has been clonally inherited

  • We focus on the SC1 collection of wild E. coli isolates that were collected in 2003–2004 near the shore of the St


Introduction

The only illustration that appears in Darwin’s Origin of Species (Darwin, 1859) is of a phylogenetic tree. The study of biological evolution in some sense corresponds to the study of the structure of this giant cell-division tree. Virtually all models of evolutionary dynamics are formulated as occurring along the branches of a tree, and many mathematical and computational methods have been developed for inferring such trees (see, for example, Felsenstein, 1981; Page and Holmes, 1998). This strategy has been employed since the earliest days of sequence analysis (Zuckerkandl and Pauling, 1965) and is almost invariably applied in the analysis of microbial genome sequences, which is the main topic of this work.
