Abstract

The single rate codon model of non-synonymous substitution is ubiquitous in phylogenetic modeling. Indeed, the use of a non-synonymous to synonymous substitution rate ratio parameter has facilitated the interpretation of selection pressure on genomes. Although the single rate model has achieved wide acceptance, we argue that the assumption of a single rate of non-synonymous substitution is biologically unreasonable, given the differences in substitution rates evident from empirical amino acid models. Some authors have attempted to incorporate amino acid substitution biases into models of codon evolution and have shown improved model performance over the single rate model. Here, we show that the single rate model of non-synonymous substitution is easily outperformed by a model with multiple non-synonymous rate classes, even when amino acid substitution pairs are assigned randomly to these classes. We argue that, since the single rate model is so easy to improve upon, new codon models should not be validated entirely on the basis of improved model fit over this model. Rather, we should strive both to improve on the single rate model and to approximate the general time-reversible model of codon substitution with as few parameters as possible, so as to reduce model over-fitting. We hint at how this can be achieved with a genetic algorithm approach in which rate classes are assigned on the basis of sequence information content.

Highlights

  • The inference of selection within protein coding genes has benefited greatly from both the development of a probabilistic framework for phylogenetics [1] and codon models

  • Each rate is shared within its class: a is the "average" synonymous substitution rate over all possible one-nucleotide substitutions that do not change the amino acid, and b is its non-synonymous analog

  • The nearly universal modeling assumption that all non-synonymous substitutions occur at the same rate is contrary to evidence that residue exchangeabilities are dependent on the physicochemical properties of amino acids (e.g. [8])
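
The two shared rates a and b described in the highlights can be illustrated with a minimal sketch of a single rate instantaneous rate matrix. This is not the authors' implementation: it assumes the universal genetic code, equal codon frequencies, and no nucleotide-level substitution biases, and the names `sr_rate_matrix` and `one_step` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Universal genetic code, codons enumerated in TCAG order (assumption:
# the standard code; other codes would change the synonymous classes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [c for c in CODE if CODE[c] != "*"]  # the 61 sense codons


def one_step(c1, c2):
    """True if the two codons differ at exactly one nucleotide position."""
    return sum(x != y for x, y in zip(c1, c2)) == 1


def sr_rate_matrix(a, b):
    """Single rate matrix: rate a for synonymous one-nucleotide changes,
    rate b for non-synonymous ones, and 0 for multi-nucleotide changes.
    Simplifying assumption: no codon frequencies or nucleotide biases."""
    n = len(SENSE)
    q = [[0.0] * n for _ in range(n)]
    for i, ci in enumerate(SENSE):
        for j, cj in enumerate(SENSE):
            if one_step(ci, cj):
                q[i][j] = a if CODE[ci] == CODE[cj] else b
        # The diagonal is still 0 here, so this makes the row sum to zero.
        q[i][i] = -sum(q[i])
    return q
```

For example, `sr_rate_matrix(1.0, 0.5)` assigns rate 1.0 to TTT→TTC (both encode phenylalanine) and rate 0.5 to TTT→TTA (phenylalanine to leucine), so every non-synonymous change shares the same rate regardless of which amino acids are exchanged.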

Introduction

The inference of selection within protein coding genes has benefited greatly from both the development of a probabilistic framework for phylogenetics [1] and codon models (see [2,3] for recent reviews). We consider the class of time-reversible codon substitution models that allow a single nucleotide to be substituted instantaneously. Within this class, the single rate (SR) model is the simplest and the general time-reversible (REV) model is the most general, with models such as LCAP occupying an intermediate range.
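
The range between SR and REV can be sketched by treating each model as an assignment of amino acid pairs to non-synonymous rate classes. The following is an illustrative sketch only, assuming the universal genetic code; `one_step_pairs` and `random_rate_classes` are hypothetical names, and the uniform random assignment shown is one simple reading of "assigned randomly", not necessarily the authors' procedure.

```python
import random
from itertools import product

# Universal genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [c for c in CODE if CODE[c] != "*"]  # the 61 sense codons


def one_step_pairs():
    """Sorted list of unordered amino acid pairs that can be exchanged
    by a single-nucleotide substitution between sense codons."""
    pairs = set()
    for c1, c2 in product(SENSE, repeat=2):
        differences = sum(x != y for x, y in zip(c1, c2))
        if differences == 1 and CODE[c1] != CODE[c2]:
            pairs.add(tuple(sorted((CODE[c1], CODE[c2]))))
    return sorted(pairs)


def random_rate_classes(k, seed=0):
    """Assign every one-step amino acid pair to one of k rate classes,
    uniformly at random. With k = 1 all pairs share one rate (SR)."""
    rng = random.Random(seed)
    return {pair: rng.randrange(k) for pair in one_step_pairs()}


def rev_rate_classes():
    """REV-like extreme: every one-step pair gets its own rate class."""
    return {pair: idx for idx, pair in enumerate(one_step_pairs())}
```

Under this view, `random_rate_classes(1)` corresponds to SR, `rev_rate_classes()` to one free rate per amino acid pair, and intermediate values of `k` give multi-rate models whose number of non-synonymous rate parameters is `k`.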

