A Bayesian Compound Stochastic Process for Modeling Nonstationary and Nonhomogeneous Sequence Evolution

S Blanquart

doi:10.1093/molbev/msl091

Abstract

Variations of nucleotidic composition affect phylogenetic inference conducted under stationary models of evolution. In particular, they may cause unrelated taxa sharing similar base composition to be grouped together in the resulting phylogeny. To address this problem, we developed a nonstationary and nonhomogeneous model accounting for compositional biases. Previous nonstationary models are branchwise, that is, they assume that base composition changes only at the nodes of the tree. In our model, by contrast, the process of compositional drift is totally uncoupled from the speciation events. In addition, the total number of events of compositional drift distributed across the tree is directly inferred from the data. We implemented the method in a Bayesian framework, relying on Markov Chain Monte Carlo algorithms, and applied it to several nucleotidic data sets. In most cases, the stationarity assumption was rejected in favor of our nonstationary model. In addition, we show that our method is able to resolve a well-known artifact. By Bayes factor evaluation, we compared our model with 2 previously developed nonstationary models. We show that the coupling between speciations and compositional shifts inherent to branchwise models may lead to an overparameterization, resulting in a lesser fit. In some cases, this leads to incorrect conclusions concerning the nature of the compositional biases. In contrast, our compound model more flexibly adapts its effective number of parameters to the data sets under investigation. Altogether, our results show that accounting for nonstationary sequence evolution may require more elaborate and more flexible models than those currently used.

Highlights

Base composition has been shown to be highly variable among species (Jukes and Bhushan 1986; Montero et al 1990; Bernardi 1993), a phenomenon generally denoted as compositional biases.
We considered the data set of 5 16S rRNAs (T. thermophilus, D. radiodurans, B. subtilis, T. maritima, and A. pyrophilus) and ran chains under the BPp model, fixing the topology either to its correct configuration s1 or to its artifactual configuration s2.
The nonstationary model introduced here differs from previous full-likelihood–based models handling compositional bias phenomena (Yang and Roberts 1995; Galtier and Gouy 1998; Foster 2004) by allowing one to infer a free number of compositional shift events along lineages.

Summary

Introduction

Base composition has been shown to be highly variable among species (Jukes and Bhushan 1986; Montero et al 1990; Bernardi 1993), a phenomenon generally denoted as compositional biases. The RY coding (Woese et al 1991) consists of replacing nucleotides A and G by R (purine) and C and T by Y (pyrimidine). In this way, only transversion events are considered: A and G become synonymous, as do C and T, and GC biases are removed. One can also accommodate the data by removing saturated sites from the analysis, such as third codon positions (Swofford et al 1996; Delsuc et al 2002; Canback et al 2004) or fast-evolving sites (Brinkmann and Philippe 1999; Philippe et al 2000). These methods have not been devised to deal with compositional biases, but assuming that biased sites are generally among
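The RY coding described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function names `ry_recode` and `gc_content` are hypothetical, and the choice to leave gaps and ambiguity codes unchanged is an assumption made here for simplicity.

```python
# Illustrative sketch of RY recoding (names are hypothetical, not from the paper).
# Purines (A, G) map to R; pyrimidines (C, T) map to Y.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def ry_recode(seq: str) -> str:
    """Recode a nucleotide sequence into the two-state R/Y alphabet.

    Assumption: characters other than A, C, G, T (gaps, ambiguity codes)
    are passed through unchanged.
    """
    return seq.upper().translate(RY_MAP)


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous nucleotides of a sequence."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


if __name__ == "__main__":
    at_rich = "AATTAT"
    gc_rich = "GGCCGC"
    # The two sequences sit at opposite extremes of GC content...
    print(gc_content(at_rich), gc_content(gc_rich))
    # ...yet they become identical once recoded, because they differ
    # only by transitions (A<->G, C<->T).
    print(ry_recode(at_rich), ry_recode(gc_rich))
```

The toy sequences make the trade-off visible: the GC difference disappears after recoding, but so does every transition, so the recoded alignment carries only transversion signal.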



Journal: Molecular Biology and Evolution	Publication Date: Aug 10, 2006
Citations: 139	License type: cc-by
