PhyPA: Phylogenetic method with pairwise sequence alignment outperforms likelihood methods in phylogenetics involving highly diverged sequences

Xuhua Xia

doi:10.1016/j.ympev.2016.07.001

Abstract

While pairwise sequence alignment (PSA) by dynamic programming is guaranteed to generate one of the optimal alignments, multiple sequence alignment (MSA) of highly divergent sequences often results in poorly aligned sequences, plaguing all subsequent phylogenetic analyses. One way to avoid this problem is to use only PSA to reconstruct phylogenetic trees, which can only be done with distance-based methods. I compared the accuracy of this new computational approach (named PhyPA, for phylogenetics by pairwise alignment) against the maximum likelihood method using MSA (the ML+MSA approach), based on nucleotide, amino acid and codon sequences simulated with different topologies and tree lengths. I present a surprising discovery: the fast PhyPA method consistently outperforms the slow ML+MSA approach for highly diverged sequences, even when all optimization options are turned on for the ML+MSA approach. Only when sequences are not highly diverged (i.e., when a reliable MSA can be obtained) does the ML+MSA approach outperform PhyPA. The true topologies are always recovered by ML with the true alignment from the simulation. However, with MSAs derived from alignment programs such as MAFFT or MUSCLE, the recovered topology consistently has a higher likelihood than the true topology. Thus, the failure of the ML+MSA approach to recover the true topology is due not to insufficient search of tree space but to the distortion of phylogenetic signal by MSA methods. I have implemented PhyPA in DAMBE, together with two approaches that use multi-gene data sets to derive phylogenetic support for subtrees, equivalent to resampling techniques such as bootstrapping and jackknifing.
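
The core idea in the abstract, aligning each pair of sequences independently and passing only the resulting distances to a distance-based tree method, can be illustrated with a minimal sketch. This is not the DAMBE implementation: the scoring values (match 1, mismatch -1, linear gap -2), the Jukes-Cantor (JC69) correction, and all function names below are assumptions chosen for illustration, and the abstract does not say which distance PhyPA uses.

```python
import math
from itertools import combinations


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment by dynamic programming (linear gap cost).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(aln_a, aln_b):
    """Proportion of differing sites among columns without gaps."""
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69(p):
    """Jukes-Cantor corrected distance (an assumed, simple substitution model)."""
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1 - 4 * p / 3)


def pairwise_distance_matrix(seqs):
    """Distance matrix in which every entry comes from its own pairwise alignment."""
    names = list(seqs)
    dist = {x: {y: 0.0 for y in names} for x in names}
    for x, y in combinations(names, 2):
        aln_x, aln_y = needleman_wunsch(seqs[x], seqs[y])
        dist[x][y] = dist[y][x] = jc69(p_distance(aln_x, aln_y))
    return dist
```

The resulting matrix would then be handed to any distance-based tree builder (for example, neighbor-joining). Because no multiple alignment is ever constructed, a poorly aligned region between one pair of sequences cannot distort the distances estimated for other pairs.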

Journal: Molecular Phylogenetics and Evolution
Publication Date: Jul 1, 2016
Citations: 25
License type: cc-by

Similar Papers

COFFEE: an objective function for multiple sequence alignments.
C Notredame ... L Holm
Bioinformatics | VOL. 14
01 Jun 1998

MISHIMA - a new method for high speed multiple alignment of nucleotide sequences of bacterial genome scale data
Kirill Kryukov ... Naruya Saitou
BMC Bioinformatics | VOL. 11
18 Mar 2010

The construction and use of log-odds substitution scores for multiple sequence alignment.
Stephen F Altschul ... Elena Zaslavsky
PLoS Computational Biology | VOL. 6
15 Jul 2010

Characterization of pairwise and multiple sequence alignment errors
Giddy Landan ... Dan Graur
Gene | VOL. 441
03 Jun 2008