Abstract

Recently it was shown that, if the subtree and chain reduction rules have been applied exhaustively to two unrooted phylogenetic trees, the reduced trees will have at most 15k-9 taxa where k is the TBR (Tree Bisection and Reconnection) distance between the two trees, and that this bound is tight. Here, we propose five new reduction rules and show that these further reduce the bound to 11k-9. The new rules combine the “unrooted generator” approach introduced in Kelk and Linz (SIAM J Discrete Math 33(3):1556–1574, 2019) with a careful analysis of agreement forests to identify (i) situations when chains of length 3 can be further shortened without reducing the TBR distance, and (ii) situations when small subtrees can be identified whose deletion is guaranteed to reduce the TBR distance by 1. To the best of our knowledge these are the first reduction rules that strictly enhance the reductive power of the subtree and chain reduction rules.

Highlights

  • A phylogenetic tree is a tree whose leaves are bijectively labelled by a set of species X [13]

  • We focus on the Tree Bisection and Reconnection (TBR) distance, which is NP-hard to compute [1,10]

  • While we do not go into detail about justifying that Tk and T'k provide a tight example, i.e. dTBR(Tk, T'k) = k, we point the interested reader to [12, Section 4], where a very similar family of constructions is given to show that the kernel result presented in [12] is tight for phylogenetic trees that are subtree and chain reduced, and do not contain any common so-called cluster
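
The definition in the first highlight can be made concrete with a small check. The following is a minimal sketch, not code from the article: it assumes a tree stored as an adjacency dictionary in which each leaf node is named by its taxon, and the function name `is_phylogenetic_tree` is hypothetical. Under that naming assumption, the leaf labelling is a bijection onto X exactly when the set of leaves equals X.

```python
def is_phylogenetic_tree(adj, taxa):
    """Return True if `adj` is a tree whose leaf set is exactly `taxa`.

    Assumptions (ours, for illustration): `adj` maps each node name to the
    set of its neighbours, leaves are named by their taxon, and there are
    at least two taxa.
    """
    if not adj:
        return False

    # The adjacency relation must be symmetric (undirected edges).
    for u, neighbours in adj.items():
        for v in neighbours:
            if v not in adj or u not in adj[v]:
                return False

    # A connected graph is a tree iff it has exactly |V| - 1 edges.
    edge_count = sum(len(neighbours) for neighbours in adj.values()) // 2
    if edge_count != len(adj) - 1:
        return False

    # Connectivity check with an iterative depth-first search.
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    if seen != set(adj):
        return False

    # Leaves are the degree-1 nodes; they must coincide with the taxon set X.
    leaves = {u for u, neighbours in adj.items() if len(neighbours) == 1}
    return leaves == set(taxa)


# Usage: the unrooted quartet tree ab|cd with internal nodes u and v.
quartet = {
    "a": {"u"}, "b": {"u"}, "c": {"v"}, "d": {"v"},
    "u": {"a", "b", "v"}, "v": {"c", "d", "u"},
}
print(is_phylogenetic_tree(quartet, {"a", "b", "c", "d"}))
```

The sketch checks only the labelling condition from the definition; it does not enforce that internal nodes have degree 3.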


Summary

Introduction

A phylogenetic tree is a tree whose leaves are bijectively labelled by a set of species (or, more generically, a set of taxa) X [13]. Earlier work proved that the two polynomial-time subtree and chain reduction rules preserve the TBR distance and reduce the number of taxa to at most 28 · dTBR(T, T') for any two unrooted phylogenetic trees T and T'. The fact that chains are preserved allows us to determine specific situations when it is safe to reduce a chain to length 2 (and sometimes to length 1), or even to identify an entire component of an optimal agreement forest (which can be deleted, reducing the TBR distance by exactly 1). These insights directly inspire the new reduction rules presented in this article. We conclude with a short reflection on potential avenues for further improving the 11 · dTBR(T, T') − 9 bound, and discuss a number of insights flowing from our analysis which might be useful when considering non-kernelization approaches for computing the TBR distance.
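
To get a feel for how the successive kernel bounds compare, the arithmetic can be tabulated directly. This is a minimal sketch: the three formulas (28k, 15k − 9 and 11k − 9, with k the TBR distance) are taken from the text and the abstract, while the function name `kernel_bounds` and the guard k ≥ 1 are assumptions made here so that every expression is positive.

```python
def kernel_bounds(k):
    """Upper bounds on the number of taxa after reduction, as a function of
    the TBR distance k. Keys name the formula each value comes from."""
    if k < 1:
        # Assumption for this sketch only: keep all three expressions positive.
        raise ValueError("this sketch expects k >= 1")
    return {
        "28k": 28 * k,            # subtree and chain rules, earlier analysis
        "15k-9": 15 * k - 9,      # subtree and chain rules, tight bound
        "11k-9": 11 * k - 9,      # with the five new reduction rules
    }


# Usage: print the three bounds side by side for small distances.
for k in range(1, 6):
    bounds = kernel_bounds(k)
    print(k, bounds["28k"], bounds["15k-9"], bounds["11k-9"])
```

The gap between the 15k − 9 and 11k − 9 bounds is 4k, so the saving from the new rules grows linearly with the TBR distance.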

Preliminaries
A New Suite of Reduction Rules
A New Kernel for Computing the TBR Distance
Tightness of the Kernel Under the New Reductions
Discussion and Future
A Proof of Theorem 5