Abstract
Background
Inference of evolutionary trees using the maximum likelihood principle is NP-hard. Therefore, all practical methods rely on heuristics. The topological transformations most often used in these heuristics are Nearest Neighbor Interchange (NNI), Subtree Prune and Regraft (SPR) and Tree Bisection and Reconnection (TBR). However, searches based on these transformations easily become trapped in local optima, since few trees are accessible in one step from any given tree. A more exhaustive topological transformation is p-Edge Contraction and Refinement (p-ECR). However, due to its high computational complexity, p-ECR has rarely been used in practice.
Results
To make the p-ECR move more efficient, this paper proposes a new method named p-ECRNJ. The main idea of p-ECRNJ is to use neighbor joining (NJ) to refine the unresolved nodes produced in p-ECR.
Conclusion
Experiments with real datasets show that p-ECRNJ can find better trees than the best known maximum likelihood methods so far and can efficiently improve local topological transforms in reasonable time.
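The contraction half of a p-ECR move can be illustrated with a small sketch. It is not the paper's implementation: the adjacency-map representation, the function name contract_edge and the four-leaf example tree are all assumptions made for illustration. Contracting a single internal edge of a binary tree merges its two endpoints into one unresolved node of degree 4, which a refinement step (NJ in p-ECRNJ) would then resolve again.

```python
def contract_edge(adj, u, v):
    """Return a new adjacency map in which edge (u, v) is contracted into u.

    `adj` maps each node name to the set of its neighbours (hypothetical
    representation chosen for this sketch).
    """
    if v not in adj[u]:
        raise ValueError("no such edge")
    # Copy every node except v, which disappears after contraction.
    new_adj = {node: set(nbrs) for node, nbrs in adj.items() if node != v}
    new_adj[u].discard(v)
    # Reattach the former neighbours of v to u.
    for w in adj[v]:
        if w == u:
            continue
        new_adj[w].discard(v)
        new_adj[w].add(u)
        new_adj[u].add(w)
    return new_adj


# Unrooted binary tree on leaves A, B, C, D with one internal edge (x, y).
quartet = {
    "A": {"x"}, "B": {"x"}, "C": {"y"}, "D": {"y"},
    "x": {"A", "B", "y"}, "y": {"C", "D", "x"},
}

star = contract_edge(quartet, "x", "y")
print(sorted(star["x"]))  # ['A', 'B', 'C', 'D']
```

With p = 1 the unresolved node has degree 4 and only three binary refinements, one of which is the original tree. Larger p produces higher-degree nodes whose refinements grow rapidly in number, which is the cost that motivates a guided refinement such as NJ.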
Highlights
Inference of evolutionary trees using the maximum likelihood principle is NP-hard
ECRML is the heuristic based on p-ECRNJ and hill climbing, as shown in Methods
ECRML+PHYML is the heuristic based on the combination of the p-ECRNJ move with Nearest Neighbor Interchange (NNI), in which rounds of NNI and p-ECRNJ are alternated
Summary
Inference of evolutionary trees using the maximum likelihood principle is NP-hard. All practical methods therefore rely on heuristics. The topological transformations often used in heuristics are Nearest Neighbor Interchange (NNI), Subtree Prune and Regraft (SPR) and Tree Bisection and Reconnection (TBR). These topological transformations often fall into local optima, since there are not many trees accessible in one step from any given tree. A rich variety of sequence-based tree reconstruction methods has been developed; they fall into three categories: (a) maximum parsimony methods, (b) distance-based methods and (c) approaches applying the maximum likelihood principle. The latter two are the most popular. Owing to their low computational time complexity and demonstrated topological accuracy for small data sets, NJ and its variants have been widely used.
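The core selection step of standard neighbor joining can be sketched as follows. This is a generic illustration of NJ, not the refinement procedure of p-ECRNJ itself; the function name nj_pick_pair and the five-taxon distance matrix are assumptions, and this section does not say how distances are obtained for an unresolved node. NJ joins the pair (i, j) that minimises Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k).

```python
from itertools import combinations


def nj_pick_pair(dist):
    """Return the index pair (i, j) with the smallest NJ Q-criterion.

    `dist` is a symmetric n x n distance matrix (list of lists) with a
    zero diagonal.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]

    def q(i, j):
        # Standard neighbor-joining criterion for joining taxa i and j.
        return (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]

    return min(combinations(range(n), 2), key=lambda pair: q(*pair))


# Illustrative distances between five taxa (made-up values).
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

print(nj_pick_pair(dist))  # (0, 1)
```

Note that taxa 3 and 4 have the smallest raw distance (3), yet the Q-criterion selects taxa 0 and 1, because it corrects each distance by how far the two taxa are from everything else. A full NJ run would replace the chosen pair with a new internal node, update the distances, and repeat until the tree is resolved.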