Abstract

Systematic biases such as long-branch attraction can mislead commonly used model-based (i.e. maximum likelihood and Bayesian) phylogenetic methods when, as is usually the case with empirical data, the model is misspecified. We present PhyQuart, a new method for evaluating the three possible binary trees for any quartet of taxa. PhyQuart was developed through a process of reciprocal illumination between a priori considerations and the results of extensive simulations. It identifies site-patterns that can be considered to support a particular quartet tree, taking into account the Hennigian distinction between apomorphic and plesiomorphic similarity, and corrects the raw observed frequencies of these site-patterns using expectations from maximum likelihood estimation. We demonstrate through extensive simulation experiments that, whereas maximum likelihood estimation performs well in many cases, PhyQuart can outperform it in cases where it fails because extreme branch-length asymmetries produce long-branch attraction artefacts under only very minor model misspecification.

Highlights

  • Reconstructing what happened is a central task of any historical science [1]

  • The robustness of maximum likelihood (ML) to variation in evolutionary processes and the extent to which model misspecification results in systematic biases and statistical inconsistency are far from fully understood

  • We know that when evolutionary signal is eroded to the extent that it is not, or is barely, distinguishable from confounding noise in the data, phylogenetic methods are more susceptible to yielding biased estimates [79]

Summary

Introduction

Reconstructing what happened is a central task of any historical science [1]. In biology, phylogenetic relationships are an important component of the history of life, some knowledge of which is a precondition of comparative methods [2].

The exclusion of complete long-branched groups might successfully reduce long-branch attraction (LBA), but is not helpful if the relationship of those taxa is of importance to the study in question. Another frequently used strategy is the removal of sequence positions inferred to be fast evolving, e.g.

We introduce PhyQuart, a new, quartet-based algorithm. It considers two alternative directions of character evolution along the internal branch of a quartet tree to discern between potentially apomorphic and plesiomorphic split-supporting site-patterns, and it uses ML to estimate the expected number of convergent split-supporting site-patterns. This combination of Hennigian logic and ML estimation represents a completely new strategy for the evaluation of sequence data. The PhyQuart algorithm is implemented in a command-line-driven software script.
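To make the notion of a split-supporting site-pattern concrete, the following is a minimal sketch in Python. It is not the PhyQuart implementation: it only tallies the raw frequency of alignment columns of the form xxyy for each of the three possible quartet splits, and it omits both the distinction between apomorphic and plesiomorphic patterns and the ML-based correction for convergent patterns. The function names `quartet_splits` and `count_split_support`, the restriction to unambiguous nucleotides, and the toy alignment are assumptions made for illustration.

```python
def quartet_splits(a, b, c, d):
    """Return the three possible splits (unrooted binary trees) for four taxa."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def count_split_support(seqs, quartet):
    """Count raw split-supporting site-patterns for each quartet split.

    seqs    -- dict mapping taxon name to an aligned nucleotide sequence
    quartet -- tuple of four taxon names present in seqs
    A column supports a split pq|rs when p and q share one state, r and s
    share another state, and the two states differ (pattern xxyy).
    """
    splits = quartet_splits(*quartet)
    counts = {split: 0 for split in splits}
    for column in zip(*(seqs[taxon] for taxon in quartet)):
        # Assumption: skip columns containing gaps or ambiguity codes.
        if any(state not in "ACGT" for state in column):
            continue
        state = dict(zip(quartet, column))
        for (p, q), (r, s) in splits:
            if state[p] == state[q] and state[r] == state[s] and state[p] != state[r]:
                counts[((p, q), (r, s))] += 1
    return counts


# Toy alignment (hypothetical data, five sites).
example = {
    "A": "AACGT",
    "B": "AAGGT",
    "C": "CAGCT",
    "D": "CACCT",
}
support = count_split_support(example, ("A", "B", "C", "D"))
for split, n in support.items():
    print(split, n)
```

In this toy alignment, sites 1 and 4 support AB|CD, site 3 supports AD|BC, and the two invariant sites support no split. Raw counts of this kind are what long-branch attraction distorts: convergent changes on two long branches also produce xxyy patterns, which is why a method needs some correction for the expected number of convergent patterns.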

Concept
Algorithm
Software implementation
Performance
Elongation of two terminal branches
Elongation of one terminal branch
Elongation of three terminal branches
Discussion
