Accuracy Guarantees for Phylogeny Reconstruction Algorithms Based on Balanced Minimum Evolution

Magnus Bordewich, Radu Mihaescu

doi:10.1109/tcbb.2013.39

Abstract

Distance-based phylogenetic methods attempt to reconstruct an accurate phylogenetic tree from an estimated matrix of pairwise distances between taxa. This paper examines two distance-based algorithms (GreedyBME and FastME) that are based on the principle of minimizing the balanced minimum evolution score of the output tree in relation to the given estimated distance matrix. This is also the principle that underlies the neighbor-joining (NJ) algorithm. We show that GreedyBME and FastME both reconstruct the entire correct tree if the input data are quartet consistent, and also that if the maximum error of any distance estimate is ε, then both algorithms output trees containing all sufficiently long edges of the true tree: those having length at least 3ε. That is to say, the algorithms have edge safety radius 1/3. In contrast, quartet consistency of the data is not sufficient to guarantee that the NJ algorithm reconstructs the correct tree, and moreover, the NJ algorithm has edge safety radius 1/4: only edges of the true tree of length at least 4ε can be guaranteed to appear in the output. These results give further theoretical support to the experimental evidence suggesting FastME is a more suitable distance-based phylogeny reconstruction method than the NJ algorithm.
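As a rough illustration of what the two edge safety radii mean in practice, the following Python sketch lists which edges of a true tree are covered by each guarantee for a given maximum distance error. The function name, the edge labels, and the numeric values are hypothetical and chosen only for this example; the threshold rule (edge length at least the error divided by the safety radius) follows the wording of the abstract and is not taken from the paper's formal definitions.

```python
from fractions import Fraction


def guaranteed_edges(edge_lengths, epsilon, safety_radius):
    """Return the labels of edges long enough to be covered by the guarantee.

    An edge is covered when its length is at least epsilon / safety_radius,
    i.e. 3*epsilon for radius 1/3 and 4*epsilon for radius 1/4.
    Exact fractions are used to avoid floating-point rounding at the threshold.
    """
    threshold = Fraction(epsilon) / Fraction(safety_radius)
    return {label for label, length in edge_lengths.items() if length >= threshold}


# Hypothetical internal edge lengths of a true tree (illustrative values only).
true_tree_edges = {
    "e1": Fraction(1, 4),   # 0.25
    "e2": Fraction(7, 20),  # 0.35
    "e3": Fraction(9, 20),  # 0.45
}
epsilon = Fraction(1, 10)   # assumed maximum error of any distance estimate

# Radius 1/3 (stated for GreedyBME and FastME): threshold is 3 * epsilon = 0.3.
bme_guaranteed = guaranteed_edges(true_tree_edges, epsilon, Fraction(1, 3))

# Radius 1/4 (stated for NJ): threshold is 4 * epsilon = 0.4.
nj_guaranteed = guaranteed_edges(true_tree_edges, epsilon, Fraction(1, 4))

print(sorted(bme_guaranteed))
print(sorted(nj_guaranteed))
```

With these illustrative values, the edge of length 0.35 falls under the 1/3 guarantee but not the 1/4 guarantee. The sketch says nothing about what either algorithm actually outputs on a given input, only about which edges the stated bounds cover.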

Full Text