Dating Phylogenies with Hybrid Local Molecular Clocks

Stéphane Aris-Brosou

doi:10.1371/journal.pone.0000879

Abstract

Background: Because rates of evolution and species divergence times cannot be estimated directly from molecular data, all current dating methods require that specific assumptions be made before inferring any divergence time. These assumptions typically bear either on rates of molecular evolution (molecular clock hypothesis, local clocks models) or on both rates and times (penalized likelihood, Bayesian methods). However, most of these assumptions can affect estimated dates, oftentimes because they underestimate large amounts of rate change.

Principal Findings: A significant modification to a recently proposed ad hoc rate-smoothing algorithm is described, in which local molecular clocks are automatically placed on a phylogeny. This modification makes use of hybrid approaches that borrow from recent theoretical developments in microarray data analysis. An ad hoc integration of phylogenetic uncertainty under these local clock models is also described. The performance and accuracy of the new methods are evaluated by reanalyzing three published data sets.

Conclusions: It is shown that the new maximum likelihood hybrid methods can perform better than penalized likelihood and almost as well as uncorrelated Bayesian models. However, the new methods still tend to underestimate the actual amount of rate change. This work demonstrates the difficulty of estimating divergence times using local molecular clocks.

Highlights

Estimating divergence times from molecular data is a special statistical endeavor, as the parameters of interest cannot be directly estimated from molecular sequences: only distances between pairs of sequences or site likelihood values can be estimated
It is advisable to use several methods to estimate a parameter of interest, such as divergence times between different species
The methods presented here constitute a significant improvement of the ad hoc rate-smoothing (AHRS) algorithm for the automatic placement of local molecular clocks by providing researchers with a means to determine how many clocks should be used to analyze their data

Summary

Introduction

Estimating divergence times from molecular data is a special statistical endeavor, as the parameters of interest cannot be directly estimated from molecular sequences: only distances between pairs of sequences or site likelihood values can be estimated. Such distances are measured in terms of the expected number of changes per site along the molecule (DNA, RNA or protein). Because rates of evolution and species divergence times cannot be estimated directly from molecular data, all current dating methods require that specific assumptions be made before inferring any divergence time. These assumptions typically bear either on rates of molecular evolution (molecular clock hypothesis, local clocks models) or on both rates and times (penalized likelihood, Bayesian methods). This work demonstrates the difficulty of estimating divergence times using local molecular clocks.
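
The confounding of rates and times described above can be illustrated with a minimal Python sketch. It is not the method of this paper: it assumes the simplest possible model, in which a branch length (expected changes per site) is the product of a single rate and a single duration. The function names `branch_length` and `strict_clock_date`, and all numerical values, are hypothetical and chosen only for illustration.

```python
import math


def branch_length(rate, time):
    """Expected number of changes per site along a branch.

    Assumes a constant rate over the branch, so length = rate * time.
    """
    return rate * time


# Two hypothetical scenarios: a slow lineage over a long time span,
# and a lineage twice as fast over half the time span.
slow_old = branch_length(rate=0.001, time=100.0)
fast_young = branch_length(rate=0.002, time=50.0)

# Both scenarios yield the same distance, so the sequence data alone
# cannot tell them apart: only the product rate * time is estimable.
same_distance = math.isclose(slow_old, fast_young)


def strict_clock_date(distance, calib_distance, calib_time):
    """Date a divergence under a strict (global) molecular clock.

    The extra assumption is that one rate, estimated from a calibration
    point of known age, applies to every branch of the tree.
    """
    rate = calib_distance / calib_time
    return distance / rate
```

Under this sketch, a distance of 0.2 changes per site, calibrated with a distance of 0.1 known to span 50 time units, is dated at about 100 time units. If the true rate differs between the calibrated lineage and the dated one, the estimate is biased, which is the situation that local clock, penalized likelihood and Bayesian approaches each address with their own assumptions.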

Methods

Results

Conclusion