Abstract

Background: Genetic analyses of DNA sequences make use of an increasingly complex set of nucleotide substitution models to estimate the divergence between gene sequences. However, there is currently no way to assess the validity of nucleotide substitution models over short time-scales and with limited mutational accumulation.

Results: We show that quantifying the decline in the ratio of transitions to transversions (ti/tv) over time provides an in-built measure of mutational saturation and hence of substitution model accuracy. We tested this through detailed phylogenetic analyses of 10 representative virus data sets comprising recently sampled and closely related sequences. In the majority of cases our estimates of ti/tv decrease with time, even under sophisticated time-reversible models of nucleotide substitution. This indicates that high levels of saturation are attained extremely rapidly in viruses, sometimes within decades. In contrast, we did not find any temporal patterns in selection pressures or CG-content over these short time-frames. To validate the temporal trend of ti/tv across a broader taxonomic range, we analyzed a set of 76 different viruses. Again, the estimate of ti/tv scaled negatively with evolutionary time, a trend that was more pronounced for rapidly-evolving RNA viruses than slowly-evolving DNA viruses.

Conclusions: Our study shows that commonly used substitution models can underestimate the number of substitutions among closely related sequences, such that the time-scale of viral evolution and emergence may be systematically underestimated. In turn, estimates of ti/tv provide an effective internal control of substitution model performance in viruses because of their high sensitivity to mutational saturation.

Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0312-6) contains supplementary material, which is available to authorized users.

Highlights

  • Estimates of mean evolutionary rates in the 10 viruses in our case study ranged across two orders of magnitude, from 2.39 × 10⁻⁵ nucleotide substitutions per site, per year for the complete Barley yellow dwarf virus (BYDV) (+ssRNA) data set to 2.35 × 10⁻³ subs/site/year for the reduced-age Cereal yellow dwarf virus (CYDV) (+ssRNA) data set (Table 1)

  • Although there was some overlap in the 95% credible intervals (CI) between the rates estimated from the reduced-age and the complete data sets, the reduced-age data sets had consistently higher mean rate estimates

Summary

Introduction

Genetic analyses of DNA sequences make use of an increasingly complex set of nucleotide substitution models to estimate the divergence between gene sequences. Estimating the number of nucleotide substitutions separating gene sequences is a fundamental task in evolutionary genetics and critical to the estimation of phylogenetic relationships and divergence times. The ratio of transitions to transversions (ti/tv) decays with the divergence times of pairs of species when mutational saturation is not taken into account [9]. For this reason, it potentially provides a useful measure of substitution model accuracy.
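To make the quantity concrete, the following is a minimal Python sketch of how an observed (uncorrected) ti/tv ratio could be counted between two aligned sequences. It is an illustration only, not the model-based estimation used in the study; the function names `count_ti_tv` and `ti_tv_ratio` are hypothetical, and the sketch assumes the sequences are already aligned and that gaps and ambiguity codes should simply be skipped.

```python
# Illustrative sketch: observed ti/tv between two aligned sequences.
# Assumptions (not from the source): sequences are pre-aligned, RNA "U" is
# treated as "T", and any site with a gap or ambiguity code is ignored.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES


def count_ti_tv(seq_a, seq_b):
    """Return (transitions, transversions) observed between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    seq_a = seq_a.upper().replace("U", "T")
    seq_b = seq_b.upper().replace("U", "T")
    ti = tv = 0
    for a, b in zip(seq_a, seq_b):
        # Skip identical sites and anything that is not an unambiguous base.
        if a == b or a not in VALID or b not in VALID:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ti += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ti, tv


def ti_tv_ratio(seq_a, seq_b):
    """Observed ti/tv; NaN when no transversions are observed."""
    ti, tv = count_ti_tv(seq_a, seq_b)
    return ti / tv if tv else float("nan")


if __name__ == "__main__":
    print(count_ti_tv("AACCGGTT", "GACTGTTA"))
    print(ti_tv_ratio("AACCGGTT", "GACTGTTA"))
```

Because a raw count like this ignores multiple substitutions at the same site, it is expected to drift downward as sequences diverge, which is the saturation effect the text describes.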
