Abstract

In phylogenomic analysis, collections of trees with identical score (maximum likelihood or parsimony score) may hamper tree search algorithms. Such collections are called phylogenetic terraces. For sparse supermatrices with a lot of missing data, the number of terraces and the number of trees on each terrace can be very large. If terraces are not taken into account, much computation time may be spent unnecessarily on evaluating many trees that in fact have identical scores. To save computation time during the tree search, it is worthwhile to identify such cases quickly. The score of a species tree is the sum of the scores of all the so-called induced partition trees. Therefore, if a topological rearrangement applied to a species tree does not change the induced partition trees, the scores of these partition trees are unchanged. Here, we provide the conditions under which the three most widely used topological rearrangements (nearest neighbor interchange, subtree pruning and regrafting, and tree bisection and reconnection) change the topologies of induced partition trees. During the tree search, these conditions allow us to quickly identify whether we can save computation time on the evaluation of newly encountered trees. We also introduce the concept of partial terraces and demonstrate that they occur more frequently than the original “full” terraces. Hence, partial terraces contribute more to the time savings than full terraces. Therefore, taking into account the above conditions and the partial terrace concept will help to speed up the tree search in phylogenomic inference.
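
The score decomposition stated above can be written out as a short worked equation. The notation is an assumption made for illustration and does not come from the text: T is a species tree, k the number of partitions, Y_i the set of taxa present in partition i, T|_{Y_i} the partition tree induced by T on Y_i, and s_i the score (likelihood or parsimony) of partition i.

```latex
% Assumed notation (not from the source): T, k, Y_i, T|_{Y_i}, s_i as described above.
\[
  s(T) \;=\; \sum_{i=1}^{k} s_i\!\left(T|_{Y_i}\right)
\]
% If a rearrangement turns T into T' and T'|_{Y_i} = T|_{Y_i} for every i, then
\[
  s(T') \;=\; \sum_{i=1}^{k} s_i\!\left(T'|_{Y_i}\right)
        \;=\; \sum_{i=1}^{k} s_i\!\left(T|_{Y_i}\right)
        \;=\; s(T).
\]
```

If the induced trees agree only for some of the partitions, only the corresponding terms of the sum can be reused, which is the situation the partial terrace concept addresses.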

Highlights

  • In phylogenomics, one aims to reconstruct a phylogenetic species tree from multiple genes

  • We have shown that it is advantageous to identify and account for full and partial terraces during the tree search in phylogenomics

  • If two trees belong to the same full or partial terrace, one needs to compute the objective function for the identical partition trees only once
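
The last highlight can be illustrated with a small memoisation sketch. Everything in it is hypothetical: the class `PartitionScoreCache`, the helper `species_tree_score`, the placeholder `toy_score`, and the gene and taxon names. Induced partition trees are assumed to be given as canonical strings, so that equal trees yield equal keys. A real scoring function would compute a likelihood or parsimony score instead of the placeholder.

```python
class PartitionScoreCache:
    """Memoise per-partition scores, keyed by (partition, induced tree)."""

    def __init__(self, score_fn):
        self._score_fn = score_fn   # hypothetical per-partition scoring function
        self._cache = {}
        self.evaluations = 0        # number of real (non-cached) evaluations

    def score(self, partition, induced_tree):
        key = (partition, induced_tree)
        if key not in self._cache:
            self._cache[key] = self._score_fn(partition, induced_tree)
            self.evaluations += 1
        return self._cache[key]


def species_tree_score(cache, induced_trees):
    """Sum the partition scores of one species tree, reusing cached terms."""
    return sum(cache.score(p, t) for p, t in induced_trees.items())


def toy_score(partition, induced_tree):
    # Placeholder only: stands in for an expensive likelihood/parsimony call.
    return float(len(induced_tree))


cache = PartitionScoreCache(toy_score)

# Two species trees described by their induced partition trees (assumed canonical).
# They share the induced tree of gene1 but differ for gene2.
tree_a = {"gene1": "((A,B),(C,E));", "gene2": "((A,C),(D,E));"}
tree_b = {"gene1": "((A,B),(C,E));", "gene2": "((A,D),(C,E));"}

score_a = species_tree_score(cache, tree_a)
score_b = species_tree_score(cache, tree_b)
print(cache.evaluations)  # 3 evaluations instead of 4: gene1 was scored once
```

Keying the cache on the pair (partition, induced tree) means that two species trees sharing only some induced partition trees still benefit, and not only trees that agree on every partition.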

Summary

INTRODUCTION

In phylogenomics, one aims to reconstruct a phylogenetic species tree from multiple genes. If a topological rearrangement does not change any of the induced partition trees, the original and the rearranged tree belong to the same terrace, and the objective function used in the tree search (maximum likelihood or maximum parsimony) does not need to be recomputed to evaluate the new tree. We first specify the conditions under which topological rearrangements applied to the species tree change the corresponding induced partition trees. Using these conditions, one can quickly determine, for a given partition, whether the objective function has to be recomputed after one of the three widely used rearrangements: nearest neighbor interchange (NNI), subtree pruning and regrafting (SPR), and tree bisection and reconnection (TBR) (Felsenstein, 2004). We also discuss the additional practical advantages of using induced partition trees in the maximum likelihood framework.
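
To make the idea concrete, the following minimal sketch checks by brute force which partitions see a different induced tree after an NNI. It is not the set of conditions derived in the paper, which this summary does not spell out. It simply restricts both species trees to each partition's taxa and compares the resulting sets of bipartitions. The function names `clades`, `induced_splits` and `changed_partitions`, the nested-tuple tree encoding, and the taxon and gene names are all assumptions made for illustration.

```python
def clades(tree, out):
    """Return the leaf set of `tree` and append every internal clade to `out`.

    A tree is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D")).
    """
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def induced_splits(tree, taxa):
    """Non-trivial bipartitions of the tree induced by `tree` on `taxa`.

    Each split is stored as an unordered pair of taxon sets, so the result
    does not depend on where the nested tuple happens to be rooted.
    """
    taxa = frozenset(taxa)
    found = []
    clades(tree, found)
    splits = set()
    for clade in found:
        side = clade & taxa
        other = taxa - side
        if len(side) >= 2 and len(other) >= 2:  # skip trivial splits
            splits.add(frozenset([side, other]))
    return splits


def changed_partitions(before, after, partitions):
    """Names of partitions whose induced tree differs between two species trees."""
    return [name for name, taxa in partitions.items()
            if induced_splits(before, taxa) != induced_splits(after, taxa)]


# Species tree before and after an NNI that swaps C and D around the
# internal edge separating {A, B, C} from {D, E, F}.
before = (("A", "B"), ("C", ("D", ("E", "F"))))
after = (("A", "B"), ("D", ("C", ("E", "F"))))

# Taxa with data in each (hypothetical) gene partition.
partitions = {
    "gene1": {"A", "B", "C", "E", "F"},        # D is missing
    "gene2": {"A", "C", "D", "E"},
    "gene3": {"A", "B", "C", "D", "E", "F"},   # complete
}

changed = changed_partitions(before, after, partitions)
print(changed)  # ['gene2', 'gene3']
```

In this example the induced tree of gene1 is the same before and after the NNI because D, one of the two swapped subtrees, is absent from that partition. Only gene2 and gene3 would need their objective function recomputed.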

Basic definitions and notations
Topological rearrangement operations
CONSEQUENCES OF TOPOLOGICAL REARRANGEMENTS APPLIED TO A SPECIES TREE
Definition of partial terraces
Occurrence of partial terraces in real data
ADVANTAGES OF USING INDUCED PARTITION TREES IN MAXIMUM LIKELIHOOD INFERENCE
Findings
DISCUSSION