Generating all binary unrooted trees compatible with a given set of binary unrooted subtrees, i.e. generating their stand, is one of the classical problems in phylogenetics. Here, we introduce Gentrius, an efficient algorithm for this task. The algorithm has a direct practical application: Gentrius generates phylogenetic terraces, that is, sets of topologically distinct trees that score equally because of missing data. Although stand generation is computationally intractable, we show on simulated and biological datasets that Gentrius generates stands with millions of trees in feasible time. We demonstrate that the number of equally optimal terrace trees varies tremendously, depending on the distribution of missing data across species and loci and on the inferred phylogeny. The strict consensus tree computed from the terrace trees displays all the branches unaffected by the pattern of missing data. Thus, by solving the problem of stand generation, Gentrius provides an important systematic assessment of phylogenetic trees inferred from incomplete data. Furthermore, Gentrius can aid theoretical research by fostering understanding of the tree space structure imposed by missing data.
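To make the role of the strict consensus concrete, the following minimal sketch computes the splits (bipartitions) shared by every tree in a small collection. It is an illustration only and not the Gentrius implementation: the function names `normalize` and `strict_consensus_splits`, the representation of a tree as a list of non-trivial splits, and the five-taxon example are all assumptions made for this sketch.

```python
def normalize(split, taxa):
    """Return the side of a bipartition that excludes a fixed reference taxon.

    A split A|B and its mirror B|A describe the same branch, so both are
    mapped to one canonical side (hypothetical helper for this sketch).
    """
    side = frozenset(split)
    reference = min(taxa)  # any fixed taxon works as the reference
    return frozenset(taxa) - side if reference in side else side


def strict_consensus_splits(trees, taxa):
    """Return the splits present in every tree (requires at least one tree).

    Each tree is assumed to be given as an iterable of non-trivial splits,
    each split written as one side of the bipartition.
    """
    split_sets = [{normalize(s, taxa) for s in tree} for tree in trees]
    return set.intersection(*split_sets)


# Hypothetical example: two trees on five taxa that share one internal branch.
taxa = {"A", "B", "C", "D", "E"}
tree1 = [{"A", "B"}, {"D", "E"}]  # ((A,B),C,(D,E))
tree2 = [{"A", "B"}, {"C", "E"}]  # ((A,B),D,(C,E))

consensus = strict_consensus_splits([tree1, tree2], taxa)
```

Here only the branch separating A and B from C, D and E survives in the consensus. The branches on which the two trees disagree are dropped, which mirrors how a strict consensus over a terrace retains only the branches unaffected by the pattern of missing data.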