Abstract

In this chapter, we give a concise, self-contained introduction to computational molecular evolution. In particular, we trace the emergence of likelihood-based methods, review the standard DNA substitution models, and explain how model choice operates. We also present recent developments in inferring absolute divergence times and rates on a phylogeny, before showing how state-of-the-art models take inspiration from diffusion theory to link molecular evolution with population genetics, which traditionally focuses on taxonomic levels below that of the species. Although this is not a cookbook chapter, we try to point to popular programs and implementations along the way.

Highlights

  • (model), so that we argue that the frequentist vs. Bayesian controversy is sterile, and we advocate a more pragmatic approach that often mixes the two [81, 82]

  • Most of the initial applications of likelihood-based methods were motivated by the shortcomings of parsimony; these methods have since become well accepted because they constitute principled inference approaches that rely on probabilistic logic

Summary

Introduction

Many books [1–7] and review papers [8–10] have been published in recent years on the topic of computational molecular evolution, so that updating our previous primer on the very same topic [11] may seem redundant. However, the field is continuously undergoing changes, as both models and algorithms become ever more sophisticated, efficient, robust, and accurate. This increase in refinement has not been motivated by a desire to complicate existing models, but rather by an old wish: that of having integrated methods that take unaligned sequences as input and simultaneously output the alignment, the tree, and other estimates of interest, within a sound statistical framework grounded in the principles of population genetics. The aim of this chapter is still to provide readers with the essentials of computational molecular evolution, offering a brief overview of recent progress in both modeling and algorithm development. Genomic-scale data are briefly touched upon, but the details are left to other chapters.

A Brief Overview of Parsimony
Assessing the Reliability of an Estimate
Parsimony and LBA
Modeling Molecular
Computation on a Tree
Substitution Models and Instantaneous Rate Matrices Q
Optimization of the Likelihood Function
Selection of the Appropriate Substitution Model
The Likelihood Ratio Test
Information-Theoretic
Cross-Validation
Counting Trees
Some Heuristics to Find the Best Tree
Dating the Tree of Life
The Strict Molecular Clock
Correlated Relaxed
Uncorrelated Relaxed Clocks
Some Applications of Relaxed Clock Models
Molecular Population Phylogenomics
Origin of Mutation–Selection Models
Fixation Probabilities
The Case of Genic
Parallelization
HPC and Cloud Computing
Findings
Conclusions