Abstract
The relationship between speciation times and the corresponding times of gene divergence is of interest in phylogenetic inference as a means of understanding the past evolutionary dynamics of populations and of estimating the timing of speciation events. It has long been recognized that gene divergence times might substantially pre-date speciation events. Although the distribution of the difference between these has previously been studied for the case of two populations, this distribution has not been explicitly computed for larger species phylogenies. Here we derive a simple method for computing this distribution for trees of arbitrary size. A two-stage procedure is proposed which (i) considers the probability distribution of the time from the speciation event at the root of the species tree to the gene coalescent time conditionally on the number of gene lineages available at the root; and (ii) calculates the probability mass function for the number of gene lineages at the root. This two-stage approach dramatically simplifies numerical analysis, because in the first step the conditional distribution does not depend on an underlying species tree, while in the second step the pattern of gene coalescence prior to the species tree root is irrelevant. In addition, the algorithm provides intuition concerning the properties of the distribution with respect to the various features of the underlying species tree. The methodology is complemented by developing probabilistic formulae and software, written in R. The method and software are tested on five-taxon species trees with varying levels of symmetry. The examples demonstrate that more symmetric species trees tend to have larger mean coalescent times and are more likely to have a unimodal gamma-like distribution with a long right tail, while asymmetric trees tend to have smaller mean coalescent times with an exponential-like distribution. In addition, species trees with longer branches generally have shorter mean coalescent times, with branches closest to the root of the tree being most influential.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Statistical Applications in Genetics and Molecular Biology
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.