Abstract

Given a collection τ of subsets of a finite set X, we say that τ is phylogenetically flexible if, for any collection R of rooted phylogenetic trees whose leaf sets comprise the collection τ, R is compatible (i.e. there is a rooted phylogenetic X-tree that displays each tree in R). We show that τ is phylogenetically flexible if and only if it satisfies a Hall-type inequality condition of being ‘slim’. Using submodularity arguments, we show that there is a polynomial-time algorithm for determining whether or not τ is slim. This ‘slim’ condition reduces to a simpler inequality in the case where all of the sets in τ have size 3, a property we call ‘thin’. Thin sets were recently shown to be equivalent to the existence of an (unrooted) tree for which the median function provides an injective mapping to its vertex set; we show here that the unrooted tree in this representation can always be chosen to be a caterpillar tree. We also characterise when a collection τ of subsets of size 2 is thin (in terms of the flexibility of total orders rather than phylogenies) and show that this holds if and only if an associated bipartite graph is a forest. The significance of our results for phylogenetics is in providing precise and efficiently verifiable conditions under which supertree methods that require consistent inputs of trees can be applied to any input trees on given subsets of species.
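The forest condition mentioned for subsets of size 2 can be illustrated with a short sketch. The abstract does not say how the associated bipartite graph is built; the code below assumes, purely for illustration, that it is the incidence graph joining each element of X to every set in τ that contains it. The function name `incidence_graph_is_forest` is hypothetical, and the cycle test uses a standard union-find structure rather than any method taken from the paper.

```python
def incidence_graph_is_forest(tau):
    """Return True if the element/set incidence graph of tau is acyclic.

    Assumption (not from the source): the bipartite graph has one node per
    set in tau, one node per element, and an edge whenever the element
    belongs to the set.
    """
    parent = {}

    def find(v):
        # Union-find lookup with path halving.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, subset in enumerate(tau):
        set_node = ("set", i)
        for x in subset:
            a, b = find(set_node), find(("elem", x))
            if a == b:
                # This edge would join two already-connected nodes: a cycle.
                return False
            parent[a] = b
    return True


path_like = [{1, 2}, {2, 3}, {3, 4}]
triangle = [{1, 2}, {2, 3}, {1, 3}]
print(incidence_graph_is_forest(path_like))
print(incidence_graph_is_forest(triangle))
```

Under this assumed construction, the three pairs forming a triangle create a cycle in the incidence graph, while the chain of overlapping pairs does not.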

Highlights

  • In phylogenomics, biologists often encounter the following problem: Given a collection τ of different subsets of species, the corresponding phylogenetic trees—each one reconstructed from the genomic data available for the corresponding subset—cannot be consistently combined into a single phylogenetic tree for all the species

  • Given a collection τ of subsets of a finite set X, we say that τ is phylogenetically flexible if, for any collection R of rooted phylogenetic trees whose leaf sets comprise the collection τ, R is compatible

  • When the collection of subsets of species has sufficiently sparse overlap, any phylogenetic tree assignment for τ will lead to a set of trees that can be consistently combined into a parent tree

Summary

Introduction

Biologists often encounter the following problem: Given a collection τ of different subsets of species, the corresponding phylogenetic trees (each one reconstructed from the genomic data available for the corresponding subset) cannot be consistently combined into a single phylogenetic tree for all the species. When this occurs, various heuristic and somewhat ad hoc ‘supertree’ methods (such as ‘matrix recoding with parsimony’) are often applied to provide some estimate of a parent tree (Felsenstein 2004). Our work is partly motivated by results from Dress and Steel (2009) where slim-type properties arise in a tree-based setting, but for a quite different question involving ‘median’ vertices. (An extension of this to sets of subsets of X of size greater than 3 is described.) In this paper, we extend this result further by showing that the tree that provides this encoding can be chosen to have a particular special type of structure (a ‘caterpillar’). Throughout this paper, X will denote a fixed finite set.

Thin Set Systems
Phylogenetic Trees and Flexible Sets
Characterisation Result
Median Characterisations
Polynomial-Time Algorithms for Thin and Slim
Discussion