Abstract
Substitution-rate variation among sites and differences in the probabilities of change among the four nucleotides are conflated in DNA sequence comparisons. When variation in rate exists among sites but is ignored, biases in the rates of change among nucleotides are underestimated. This paper provides a quantification of this effect when the observed proportions of transitions, P, and transversions, Q, between two sequences are used to estimate transition bias. The utility of P/Q as an estimator is examined both with and without rate variation among sites. A gamma-distributed-rates model is used to illustrate the effect that variation among sites has on estimates of transition bias, but it is argued that the basic results should hold for any pattern of rate variation. Naive estimates of the extent of transition bias, those that ignore rate variation when it is present, can seriously underestimate its true value. The extent of this underestimation increases with the amount of rate variation among sites. An example using human mitochondrial DNA shows that a simple comparison of the proportions of transitions and transversions in recently diverged sequences underestimates the level of transition bias by approximately 15%. This underestimation does not depend on the use of P/Q to estimate transition bias; maximum-likelihood methods give similar results.
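The effect described above can be illustrated with a minimal sketch. The abstract does not name the substitution model, so the code assumes Kimura's two-parameter model, with transition rate `alpha` and a rate `beta` for each of the two transversion types, and gamma-distributed relative rates with mean 1 and shape parameter `shape`. The function names, parameter names, and numerical values are hypothetical and are not taken from the paper.

```python
import math


def k2p_proportions(alpha, beta, tau, shape=None):
    """Expected proportions of transitions (P) and transversions (Q).

    Assumes Kimura's two-parameter model (an assumption of this sketch).
    tau is the total time separating the two sequences.
    shape is the gamma shape parameter for relative rates with mean 1;
    shape=None means every site evolves at the same rate.
    """
    def mean_exp(c):
        # E[exp(-c * r)] over the relative rate r at a site.
        if shape is None:
            return math.exp(-c)
        # Laplace transform of a gamma distribution with mean 1.
        return (1.0 + c / shape) ** (-shape)

    e1 = mean_exp(2.0 * (alpha + beta) * tau)
    e2 = mean_exp(4.0 * beta * tau)
    p = 0.25 * (1.0 - 2.0 * e1 + e2)
    q = 0.5 * (1.0 - e2)
    return p, q


def naive_ratio(alpha, beta, tau, shape=None):
    """Naive estimate P/Q of the transition bias alpha / (2 * beta)."""
    p, q = k2p_proportions(alpha, beta, tau, shape)
    return p / q


if __name__ == "__main__":
    alpha, beta, tau = 10.0, 0.5, 0.01  # hypothetical values
    print("true bias alpha/(2*beta):", alpha / (2.0 * beta))
    print("P/Q, equal rates:", naive_ratio(alpha, beta, tau))
    for shape in (2.0, 0.5, 0.1):
        print("P/Q, gamma shape", shape, ":",
              naive_ratio(alpha, beta, tau, shape))
```

Under these assumptions, P/Q approaches alpha/(2*beta) only in the limit of zero divergence. At any fixed positive divergence it falls below that value, and it falls further as the shape parameter decreases, that is, as rate variation among sites increases.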