We combine two fundamental optimization problems related to the construction of phylogenetic trees called maximum rooted triplets consistency and minimally resolved supertree into a new problem, which we call q-maximum rooted triplets consistency (q-MAXRTC). It takes as input a set R of rooted, binary phylogenetic trees with three leaves each and asks for a phylogenetic tree with exactly q internal nodes that contains the largest possible number of trees from R. We prove that q-MAXRTC is NP-hard to approximate within a constant, develop polynomial-time approximation algorithms for different values of q, and show experimentally that representing a phylogenetic tree by one having much fewer nodes typically does not destroy too much branching information. To demonstrate the algorithmic advantage of using trees with few internal nodes, we also propose a new algorithm for computing the rooted triplet distance that is faster than the existing algorithms when restricted to such trees.
Read full abstract