Abstract

A multiply-labeled tree (or MUL-tree) is a rooted tree in which every leaf is labeled by an element from some set \({\mathcal X}\), but in which more than one leaf may be labeled by the same element of \({\mathcal X}\). MUL-trees have applications in many fields. In phylogenetics, they can represent the evolution of gene families: each gene is labeled by the species it belongs to, and leaf labels are non-unique because a given genome may contain many paralogous genes. In this paper, we consider two problems related to the leaf-pruning (leaf removal) of MUL-trees leading to single-labeled trees. First, given a set of MUL-trees, the MUL-tree Set Pruning for Consistency (MULSETPC) Problem asks for a pruning of each tree leading to a set of consistent trees, i.e., a collection of label-isomorphic single-labeled trees. Second, processing one gene tree at a time, the MUL-tree Pruning for Reconciliation (MULPR) Problem asks for a pruning minimizing a reconciliation cost with a given species tree. We show that MULSETPC is NP-hard and that MULPR is W[2]-hard when parameterized by the duplication cost. We then develop a polynomial-time heuristic for MULPR and show its accuracy by comparing it to a brute-force exact method on a set of gene trees from the Ensembl Genome Browser.

Keywords: Multilabeled tree; Phylogeny; Gene tree; Duplication; Reconciliation
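To make the notion of leaf-pruning concrete, the following is a minimal Python sketch that enumerates the single-labeled trees obtainable from a small MUL-tree. It is an illustration only, not the brute-force method evaluated in the paper. Several things here are assumptions: the nested-tuple tree encoding, the helper names `leaves`, `restrict`, and `prunings`, the convention that a pruning keeps exactly one leaf per label, and the convention that internal nodes left with a single child are suppressed.

```python
from itertools import product


def leaves(tree, path=()):
    """Yield (path, label) for every leaf.

    Assumed encoding: a leaf is a str label, an internal node is a tuple
    of child subtrees. A path is the tuple of child indices from the root.
    """
    if isinstance(tree, str):
        yield path, tree
    else:
        for i, child in enumerate(tree):
            yield from leaves(child, path + (i,))


def restrict(tree, keep, path=()):
    """Keep only the leaves whose path is in `keep`.

    Subtrees that lose all their leaves are dropped, and internal nodes
    left with a single child are suppressed (replaced by that child).
    """
    if isinstance(tree, str):
        return tree if path in keep else None
    kids = [restrict(c, keep, path + (i,)) for i, c in enumerate(tree)]
    kids = [k for k in kids if k is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def prunings(tree):
    """Enumerate the single-labeled trees obtained by keeping one leaf per label."""
    by_label = {}
    for path, label in leaves(tree):
        by_label.setdefault(label, []).append(path)
    # One choice of surviving leaf per label; the number of choices is the
    # product of the label multiplicities, so this is exponential in general.
    for choice in product(*by_label.values()):
        yield restrict(tree, set(choice))


# Hypothetical example: label "a" occurs twice, so there are two prunings.
mul_tree = (("a", "b"), ("a", "c"))
for single_labeled in prunings(mul_tree):
    print(single_labeled)
```

Under these assumptions the number of candidate prunings grows as the product of the label multiplicities, which is why exhaustive enumeration is practical only for small trees.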
