Abstract
Background: A frequent event in the evolution of prokaryotic genomes is homologous recombination, where a foreign DNA stretch replaces a genomic region similar in sequence. Recombination can affect the relative position of two genomes in a phylogenetic reconstruction in two different ways: (i) one genome can recombine with a DNA stretch that is similar to the other genome, thereby reducing their pairwise sequence divergence; (ii) one genome can recombine with a DNA stretch from an outgroup genome, increasing the pairwise divergence. While several recombination-aware phylogenetic algorithms exist, many of these cannot account for both types of recombination; some algorithms can, but do so inefficiently. Moreover, many of them reconstruct the ancestral recombination graph (ARG) to help infer the genome tree, and require that a substantial portion of each genome has not been affected by recombination, a sometimes unrealistic assumption.

Methods: Here, we propose a Coarse-Graining approach for Phylogenetic reconstruction (CGP), which is recombination-aware but forgoes ARG reconstruction. It accounts for the tendency of a higher effective recombination rate between genomes with a lower phylogenetic distance. It is applicable even if all genomic regions have experienced substantial amounts of recombination, and can be used on both nucleotide and amino acid sequences. CGP considers the local density of substitutions along pairwise genome alignments, fitting a model to the empirical distribution of substitution density to infer the pairwise coalescent time. Given all pairwise coalescent times, CGP reconstructs an ultrametric tree representing vertical inheritance.

Results: Based on simulations, we show that the proposed approach can reconstruct ultrametric trees with accurate topology, branch lengths, and root positioning. Applied to a set of E. coli strains, the reconstructed trees are most consistent with gene distributions when inferred from amino acid sequences, a data type that cannot be utilized by many alternative approaches.

Conclusions: The CGP algorithm is more accurate than alternative recombination-aware methods for ultrametric phylogenetic reconstructions.
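The last step described in the Methods, turning a full set of pairwise coalescent times into an ultrametric tree, can be illustrated with a minimal sketch. The abstract does not say which tree-building procedure CGP uses, so the average-linkage clustering (UPGMA) below is a stand-in chosen only because it always yields an ultrametric tree; the function name `upgma`, the input layout, and the internal node labels are assumptions made for illustration.

```python
from itertools import combinations


def upgma(names, times):
    """Build an ultrametric tree from pairwise coalescent times.

    names: list of genome labels (assumed not to start with "_node").
    times: dict mapping frozenset({a, b}) to the coalescent time of a and b.
    Returns a nested tuple (child, child, height); leaves are plain labels.
    NOTE: UPGMA is an illustrative stand-in, not necessarily CGP's own method.
    """
    # Each cluster id maps to (subtree, number of leaves in it).
    clusters = {name: (name, 1) for name in names}
    dist = {frozenset(p): times[frozenset(p)] for p in combinations(names, 2)}
    next_id = 0
    while len(clusters) > 1:
        # Merge the two clusters with the smallest (average) coalescent time.
        pair = min(dist, key=dist.get)
        a, b = tuple(pair)
        height = dist.pop(pair)
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        new = f"_node{next_id}"
        next_id += 1
        # Size-weighted average of the times to every remaining cluster.
        for c in clusters:
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            dist[frozenset((new, c))] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        clusters[new] = ((tree_a, tree_b, height), n_a + n_b)
    (tree, _), = clusters.values()
    return tree
```

For three genomes where A and B coalesce at time 1 and both coalesce with C at time 3, the sketch joins A and B at height 1 and places the root at height 3. The order of the two children within a node is not fixed.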
Highlights
A frequent event in the evolution of prokaryotic genomes is homologous recombination, where a foreign DNA stretch replaces a genomic region similar in sequence.
A coarse-graining approach to phylogenetic reconstruction: Figure 1 briefly illustrates how the proposed Coarse-Graining approach for Phylogenetic reconstruction (CGP) algorithm fits the distribution of local single-site polymorphism (SSP) density across genome pairs to infer their phylogenetic tree, forgoing reconstruction of the ancestral recombination graph (ARG).
CGP is based on a mathematical model [5, 6] that quantitatively describes the evolution of genomic sequence divergence; this model is applicable to both nucleotide and amino acid sequences, and does not assume a low recombination rate.
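To make the "distribution of local SSP density" concrete, here is a minimal sketch of how such an empirical distribution could be tabulated from one pairwise alignment. It covers only the counting step, not the model fit. The function name `ssp_density_histogram`, the use of non-overlapping windows, the window size, and the treatment of gaps are all assumptions made for illustration; the source does not specify them.

```python
from collections import Counter


def ssp_density_histogram(seq_a, seq_b, window=100):
    """Tabulate SSP counts per non-overlapping window of an aligned pair.

    seq_a, seq_b: aligned sequences of equal length ('-' marks a gap).
    window: window length in alignment columns (an assumed parameter).
    Returns a Counter mapping "SSPs in a window" to "number of such windows".
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    counts = []
    # Trailing columns that do not fill a whole window are ignored.
    for start in range(0, len(seq_a) - window + 1, window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # Columns with a gap in either sequence are not counted as SSPs.
        diffs = sum(1 for x, y in zip(a, b)
                    if x != y and x != "-" and y != "-")
        counts.append(diffs)
    return Counter(counts)
```

The resulting histogram is the kind of empirical distribution a model could then be fitted to. The same counting works for amino acid alignments, since it only compares characters column by column.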
Summary
A frequent event in the evolution of prokaryotic genomes is homologous recombination, where a foreign DNA stretch replaces a genomic region similar in sequence. The transfer of DNA stretches from one prokaryotic genome to another, called horizontal gene transfer (HGT) or lateral gene transfer (LGT), is a major driver of prokaryotic evolution [1]. It is caused by a variety of mechanisms, including transformation, transduction, conjugation, and gene transfer agents [2, 3]. A foreign DNA stretch that enters the prokaryotic cell and survives these host defenses may be incorporated into the host genome. The incoming stretch may be inserted directly into the host genome through non-homologous recombination.