Abstract

Solving median tree problems is a classic approach for inferring species trees from a collection of discordant gene trees. Median tree problems are typically NP-hard and are usually addressed with local search heuristics. Unfortunately, such heuristics generally offer no provable guarantees of correctness or precision. Algorithmic advances addressing this uncertainty have led to exact dynamic programming formulations suitable for solving a well-studied group of median tree problems in smaller phylogenetic analyses. However, these formulations allow computing only very few optimal species trees out of possibly many such trees, whereas phylogenetic studies often require the analysis of all optimal solutions through their consensus tree. Here, we describe a significant algorithmic modification of the dynamic programming formulations that computes the cluster counts of all optimal species trees, from which various types of consensus trees can be efficiently computed. Through experimental studies, we demonstrate that our parallel implementation of the modified dynamic programming formulation is more efficient than a previous implementation of the original formulation. Finally, we show that the parallel implementation can rapidly identify novel reassorted influenza A viruses, potentially facilitating pandemic preparedness efforts.
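
To illustrate why cluster counts suffice for building consensus trees, the following is a minimal Python sketch, not the dynamic programming formulation described above. It assumes a small, explicitly listed set of hypothetical optimal trees written as nested tuples, whereas the modified formulation obtains the counts without enumerating the trees. The function names (`clusters_of`, `cluster_counts`, `consensus_clusters`) and the example trees are illustrative assumptions.

```python
from collections import Counter


def clusters_of(tree):
    """Return the clusters (frozensets of leaf names) of a nested-tuple tree.

    Leaves are strings; internal nodes are tuples of subtrees.
    Only internal nodes contribute clusters, including the root.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        cluster = frozenset().union(*(leaves(child) for child in node))
        found.add(cluster)
        return cluster

    leaves(tree)
    return found


def cluster_counts(trees):
    """Count in how many of the given trees each cluster appears."""
    counts = Counter()
    for tree in trees:
        counts.update(clusters_of(tree))
    return counts


def consensus_clusters(counts, num_trees, threshold=0.5):
    """Select the clusters of a consensus tree from cluster counts alone.

    threshold=0.5 gives majority rule (strictly more than half of the trees);
    threshold=1.0 gives the strict consensus (present in every tree).
    """
    if threshold >= 1.0:
        return {c for c, n in counts.items() if n == num_trees}
    return {c for c, n in counts.items() if n > threshold * num_trees}


# Hypothetical set of three optimal species trees over taxa A, B, C, D.
optimal_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "D"), "C"),
    ((("A", "C"), "B"), "D"),
]

counts = cluster_counts(optimal_trees)
majority = consensus_clusters(counts, len(optimal_trees))
strict = consensus_clusters(counts, len(optimal_trees), threshold=1.0)
```

In this example the majority-rule selection keeps the clusters {A, B}, {A, B, C}, and {A, B, C, D}, while the strict consensus keeps only the root cluster. In both cases the selection needs only the counts and the number of optimal trees, not the trees themselves.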
