Tree Compatibility, Incomplete Directed Perfect Phylogeny, and Dynamic Graph Connectivity: An Experimental Study

David Fernández-Baca, Lei Liu

doi:10.3390/a12030053

Abstract

We study two problems in computational phylogenetics. The first is tree compatibility. The input is a collection P of phylogenetic trees over different, partially overlapping sets of species. The goal is to find a single phylogenetic tree that displays all the evolutionary relationships implied by P. The second problem is incomplete directed perfect phylogeny (IDPP). The input is a data matrix describing a collection of species by a set of characters, where some of the information is missing. The question is whether there exists a way to fill in the missing information so that the resulting matrix can be explained by a phylogenetic tree satisfying certain conditions. We explain the connection between tree compatibility and IDPP and show that a recent tree compatibility algorithm is effectively a generalization of an earlier IDPP algorithm. Both algorithms rely heavily on maintaining the connected components of a graph under a sequence of edge and vertex deletions, for which they use the dynamic connectivity data structure of Holm et al., known as HDT. We present a computational study of algorithms for tree compatibility and IDPP. We show experimentally that replacing HDT with a much simpler data structure (essentially, a single-level version of HDT) improves the performance of both algorithms in practice. We give partial empirical and theoretical justifications for this observation.
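The operation both algorithms depend on, maintaining connected components while edges and vertices are deleted, can be illustrated with a deliberately naive baseline. The sketch below is not HDT and not the single-level variant studied in the paper: it answers each connectivity query with a fresh breadth-first search, and the class and method names are hypothetical, chosen only for illustration.

```python
from collections import deque
from typing import Dict, Hashable, Set


class NaiveDecrementalGraph:
    """Undirected graph with deletions and connectivity queries.

    Assumption: this is a naive baseline (one BFS per query), not HDT
    and not the single-level structure evaluated in the paper.
    """

    def __init__(self) -> None:
        self.adj: Dict[Hashable, Set[Hashable]] = {}

    def add_edge(self, u: Hashable, v: Hashable) -> None:
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    def connected(self, u: Hashable, v: Hashable) -> bool:
        """Return True if v is reachable from u."""
        if u not in self.adj or v not in self.adj:
            return False
        seen = {u}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            if x == v:
                return True
            for y in self.adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        return False

    def delete_edge(self, u: Hashable, v: Hashable) -> bool:
        """Delete edge {u, v}; return True if a component was split."""
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        return not self.connected(u, v)

    def delete_vertex(self, u: Hashable) -> None:
        """Delete u together with all of its incident edges."""
        for v in self.adj.pop(u, set()):
            if v in self.adj:
                self.adj[v].discard(u)
```

Each deletion here costs time linear in the size of the affected component in the worst case. Structures such as HDT exist to avoid that cost by maintaining a spanning forest and searching for a replacement edge only when a forest edge is removed.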

Highlights

A phylogenetic tree is a graphical depiction of the evolutionary history of a collection of taxa.
The problem is to find a tree T whose taxon set is the union of the taxon sets of the input trees, such that each input tree Ti can be obtained from the restriction of T to the leaf set of Ti through edge contraction.
BuildNT is closely related to Semple and Steel’s version of BUILD [2].

Summary

Introduction

A phylogenetic tree is a graphical depiction of the evolutionary history of a collection of taxa (typically species or genes). Given a collection P of input trees, called a profile, the tree compatibility problem is to find a tree T whose taxon set is the union of the taxon sets of the input trees, such that each input tree Ti can be obtained from the restriction of T to the leaf set of Ti through edge contraction. If such a tree T exists, P is said to be compatible; otherwise, P is incompatible. Since a profile of rooted trees is effectively a collection of unrooted trees that have a common root taxon, the preceding observation establishes the connection between rooted tree compatibility and IDPP. Our empirical results show that, in this setting, simple data structures perform better than more sophisticated ones with better asymptotic bounds.
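For rooted trees, the condition that Ti can be obtained from the restriction of T by edge contraction can be checked through clusters (the leaf sets below each node): every cluster of Ti must appear among the clusters of T intersected with the leaf set of Ti. The following is a minimal sketch of that check, assuming rooted trees encoded as nested tuples of leaf labels. The encoding and the function names are illustrative assumptions, and this verifies a given candidate T rather than constructing one as the paper's algorithms do.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed encoding: a leaf is a string label, an internal node is a
# tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the leaf set below every node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def displays(big: Tree, small: Tree) -> bool:
    """True if `small` arises from `big` restricted to small's leaves
    by contracting edges (rooted case, checked via clusters)."""
    small_clusters = clusters(small)
    taxa = max(small_clusters, key=len)  # the root cluster is the leaf set
    restricted = {c & taxa for c in clusters(big)} - {frozenset()}
    return small_clusters <= restricted


def is_displayed_profile(big: Tree, profile) -> bool:
    """True if `big` displays every tree in the profile."""
    return all(displays(big, t) for t in profile)
```

For example, with T = ((("a", "b"), "c"), ("d", "e")), the tree (("a", "b"), "d") is displayed, and so is the unresolved star ("a", "b", "d"), since contraction may collapse internal edges. The tree (("a", "d"), "b") is not, because no node of T separates a and d from b.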

Background

Contributions

Contents

Graphs and Phylogenetic Trees

Spanning Forests and Euler Tour Trees

Edge Deletion in HDT

Level Truncation

Tree Compatibility

The Display Graph

Incomplete Directed Perfect Phylogeny

The Relationship between Tree Compatibility and IDPP

Experiments with Tree Compatibility

Real Datasets

Generating Simulated Data

Impact of Level Truncation

Worst-Case Time versus Empirically-Observed Time

Performance on Profiles of More General Phylogenetic Trees

Connectivity Testing versus Maintaining Semi-Universal Labels

Experiments with IDPP

Simulated Datasets

Solving IDPP via Tree Compatibility

Analysis

The Impact of Deleting Non-Tree Edges

The Number of Edges Scanned

The Size of the Smaller Component

Findings

Discussion

Journal: Algorithms	Publication Date: Feb 28, 2019
Citations: 4	License type: CC BY 4.0
