Testing the agreement of trees with internal labels

David Fernández-Baca, Lei Liu

doi:10.1186/s13015-021-00201-9

Open Access

Abstract

Background: A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa. Semi-labeled trees encompass ordinary phylogenetic trees and taxonomies. Suppose we are given a collection $\mathcal{P} = \{\mathcal{T}_1, \mathcal{T}_2, \ldots, \mathcal{T}_k\}$ of semi-labeled trees, called input trees, over partially overlapping sets of taxa. The agreement problem asks whether there exists a tree $\mathcal{T}$, called an agreement tree, whose taxon set is the union of the taxon sets of the input trees such that the restriction of $\mathcal{T}$ to the taxon set of $\mathcal{T}_i$ is isomorphic to $\mathcal{T}_i$, for each $i \in \{1, 2, \ldots, k\}$. The agreement problem is a special case of the supertree problem, the problem of synthesizing a collection of phylogenetic trees with partially overlapping taxon sets into a single supertree that represents the information in the input trees. An obstacle to building large phylogenetic supertrees is the limited amount of taxonomic overlap among the phylogenetic studies from which the input trees are obtained. Incorporating taxonomies into supertree analyses can alleviate this issue.

Results: We give an $\mathcal{O}\left(n k \left(\sum_{i \in [k]} d_i + \log^2(nk)\right)\right)$ algorithm for the agreement problem, where $n$ is the total number of distinct taxa in $\mathcal{P}$, $k$ is the number of trees in $\mathcal{P}$, and $d_i$ is the maximum number of children of a node in $\mathcal{T}_i$.

Conclusion: Our algorithm can aid in integrating taxonomies into supertree analyses. Our computational experience with the algorithm suggests that its performance in practice is much better than its worst-case bound indicates.
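The agreement condition stated in the abstract can be illustrated with a small verification sketch. This is not the algorithm of the paper, which decides whether an agreement tree exists; it only checks whether a given candidate tree agrees with every input tree. The names `Node`, `labels`, `restrict`, `canon`, and `is_agreement_tree` are hypothetical. The restriction rule used here (drop labels outside the taxon set, prune label-free subtrees, suppress unlabeled nodes left with a single child) is an assumption about the intended definition, and the input trees are assumed to contain no unlabeled node with exactly one child.

```python
class Node:
    """A node of a semi-labeled tree; label is None for unlabeled internal nodes."""
    def __init__(self, label=None, children=()):
        self.label = label
        self.children = list(children)

def labels(t):
    """Set of all taxa labeling nodes of the tree rooted at t."""
    out = set() if t.label is None else {t.label}
    for c in t.children:
        out |= labels(c)
    return out

def restrict(t, keep):
    """Assumed restriction of t to the taxon set `keep` (None if nothing survives)."""
    kids = [r for r in (restrict(c, keep) for c in t.children) if r is not None]
    label = t.label if t.label in keep else None
    if label is None:
        if not kids:
            return None      # subtree carries no retained label: prune it
        if len(kids) == 1:
            return kids[0]   # unlabeled node with one child: suppress it
    return Node(label, kids)

def canon(t):
    """Order-independent canonical form; valid because labels are distinct."""
    return (t.label, frozenset(canon(c) for c in t.children))

def is_agreement_tree(candidate, inputs):
    """True if candidate has the union taxon set and restricts to each input tree."""
    if labels(candidate) != set().union(*(labels(t) for t in inputs)):
        return False
    for t in inputs:
        r = restrict(candidate, labels(t))
        if r is None or canon(r) != canon(t):
            return False
    return True

# Toy profile: t1 is an ordinary phylogeny, t2 has an internal label "X".
t1 = Node(None, [Node(None, [Node("a"), Node("b")]), Node("c")])
t2 = Node(None, [Node("X", [Node("b"), Node("d")]), Node("c")])
good = Node(None, [Node("X", [Node("a"), Node("b"), Node("d")]), Node("c")])
bad = Node(None, [Node("a"), Node("X", [Node("b"), Node("d")]), Node("c")])
```

In this toy profile, `good` places `a` under the labeled node `X`, so restricting it to the taxa of `t1` recovers the grouping of `a` with `b`. The tree `bad` attaches `a` at the root instead, so its restriction to the taxa of `t1` loses that grouping.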

Highlights

A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa
The question is whether there exists a tree T whose taxon set is the union of the taxon sets of the input trees such that T_i is isomorphic to the restriction of T to the taxon set of T_i, for each i ∈ {1, 2, ..., k}
We study a generalization of the agreement problem, where the internal nodes of the input trees may be labeled

Summary

Background

Suppose S is a nice exposed subset in a valid position π and let A be any set in (S). Proof of Lemma 9: Suppose, on the contrary, that there exist at least two distinct maximal nice exposed subsets S, S′. Lines 5–11 of Decompose construct the maximal nice exposed subset by deleting bad labels from S and merging sets in Ŵ. Lemma 11: Let π be a valid position in a profile P and let S∗ be the maximal nice exposed subset in π. Each virtually connected component contains all the labels in precisely one of the sets of the collection Ŵ in the minimal good partition (S, Ŵ) of Ch_P(π_init).

Discussion

Conclusions