Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost

Shinsuke Yamada, Hayato Yamana, Osamu Gotoh

doi:10.1186/1471-2105-7-524

Abstract

Background: Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. In the alignment of a family of protein sequences, global MSA algorithms perform better than local ones in many cases, while local ones perform better than global ones when some sequences have long insertions or deletions (indels) relative to others. Many recent leading MSA algorithms have incorporated pairwise alignment information obtained from a mixture of sources into their scoring system to improve accuracy of alignment containing long indels.

Results: We propose a novel group-to-group sequence alignment algorithm that uses a piecewise linear gap cost. We developed a program called PRIME, which employs our proposed algorithm to optimize the well-defined sum-of-pairs score. PRIME stands for Profile-based Randomized Iteration MEthod. We evaluated PRIME and some recent MSA programs using BAliBASE version 3.0 and PREFAB version 4.0 benchmarks. The results of benchmark tests showed that PRIME can construct accurate alignments comparable to the most accurate programs currently available, including L-INS-i of MAFFT, ProbCons, and T-Coffee.

Conclusion: PRIME enables users to construct accurate alignments without having to employ pairwise alignment information. PRIME is available at .
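As a rough illustration of why a piecewise linear gap cost helps with long indels, the sketch below compares it with an affine cost. It assumes the common formulation in which the cost of a gap is the minimum over several linear pieces, each with its own opening and extension penalty. The function names and the parameter values in `PIECES` are hypothetical and are not taken from PRIME.

```python
from typing import Sequence, Tuple


def affine_gap_cost(length: int, gap_open: float, gap_extend: float) -> float:
    """Affine cost: one opening penalty plus a constant cost per gap position."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def piecewise_linear_gap_cost(
    length: int, pieces: Sequence[Tuple[float, float]]
) -> float:
    """Piecewise linear cost: the cheapest of several (open, extend) lines.

    With a steep line for short gaps and a flatter line with a larger
    opening penalty for long gaps, the result is concave in the gap length.
    """
    if length <= 0:
        return 0.0
    return min(gap_open + gap_extend * length for gap_open, gap_extend in pieces)


# Hypothetical parameters, chosen only for illustration.
PIECES = [(9.0, 1.0), (17.0, 0.25)]

for k in (1, 5, 20, 50):
    affine = affine_gap_cost(k, 9.0, 1.0)
    piecewise = piecewise_linear_gap_cost(k, PIECES)
    print(f"gap length {k:2d}: affine {affine:5.2f}, piecewise {piecewise:5.2f}")
```

With these illustrative values the two costs agree for short gaps, while a long gap is penalized much less under the piecewise cost, which is the behavior one would want when some sequences carry long indels relative to others.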

Highlights

Multiple sequence alignment (MSA) is a useful tool in bioinformatics
Because only about 20% of the sequences in BAliBASE version 3.0 [23] used for the test are common to those in BAliBASE version 2.01, we do not think that these parameters are over-fitted against BAliBASE version 3.0
The group-to-group sequence alignment algorithm is the key to most heuristic MSA algorithms

Summary

Results

We developed a program called PRIME (Profile-based Randomized Iteration MEthod). The results indicate that PRIMEpiecewise (PRIME with the piecewise linear gap cost) is less affected by such regions than PRIMEaffine (PRIME with an affine gap cost). This follows the general tendency that terminal gaps reduce the accuracy of global alignment programs, including Prrn, MUSCLE, POA, and ClustalW, more significantly than that of MAFFT, ProbCons, and T-Coffee, which incorporate local alignment information in some way. The computational speed would be significantly improved by incorporating anchoring heuristics and refining the source code.

Figure caption (full-length set): The horizontal axis denotes the reference alignment ID, and the vertical axis denotes the difference in sum-of-pairs or column scores between PRIMEpiecewise and PRIMEaffine on the respective alignments.

Figure caption (homologous region set): The horizontal axis denotes the reference alignment ID, and the vertical axis denotes the difference in sum-of-pairs or column scores between PRIMEpiecewise and PRIMEaffine on the respective alignments.

Table caption: The Overall and Ranksum columns show the average sum-of-pairs scores and the rank sum of the Friedman test, respectively, using all alignments of the whole homologous region set.
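The sum-of-pairs and column scores mentioned above measure how well a test alignment reproduces a reference alignment. The following is a minimal sketch of both measures, assuming simplified BAliBASE-style definitions: the sum-of-pairs score is the fraction of residue pairs aligned in the reference that are also aligned in the test alignment, and the column score is the fraction of reference columns reproduced exactly. It ignores core-block annotations, and all function names are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Set, Tuple

Column = Tuple[Optional[int], ...]


def residue_columns(alignment: List[str]) -> List[Column]:
    """Describe each column by the residue index of every row (None for a gap)."""
    counters = [0] * len(alignment)
    columns = []
    for c in range(len(alignment[0])):
        column = []
        for r, row in enumerate(alignment):
            if row[c] == "-":
                column.append(None)
            else:
                column.append(counters[r])
                counters[r] += 1
        columns.append(tuple(column))
    return columns


def aligned_pairs(columns: List[Column]) -> Set[Tuple[int, int, int, int]]:
    """Collect every pair of residues that share a column."""
    pairs = set()
    for column in columns:
        for (r1, i1), (r2, i2) in combinations(enumerate(column), 2):
            if i1 is not None and i2 is not None:
                pairs.add((r1, i1, r2, i2))
    return pairs


def sp_score(test: List[str], reference: List[str]) -> float:
    """Fraction of reference residue pairs that the test alignment recovers."""
    ref_pairs = aligned_pairs(residue_columns(reference))
    test_pairs = aligned_pairs(residue_columns(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs)


def column_score(test: List[str], reference: List[str]) -> float:
    """Fraction of reference columns that appear unchanged in the test alignment."""
    ref_columns = residue_columns(reference)
    test_columns = set(residue_columns(test))
    return sum(col in test_columns for col in ref_columns) / len(ref_columns)


# Toy alignments, invented for illustration only.
reference = ["AC-GT", "ACTGT", "A--GT"]
test = ["ACG-T", "ACTGT", "A-G-T"]
print(sp_score(test, reference), column_score(test, reference))
```

Both functions expect the two alignments to contain the same sequences in the same row order. The per-alignment differences plotted in the figures would then be the score of one program minus the score of the other on each reference alignment.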

Background

Construct a phylogenetic tree from the distance matrix
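The text does not say here which tree-building method is used. As one common way to turn a distance matrix into a guide tree, the sketch below implements UPGMA (average-linkage clustering). The choice of UPGMA, the function name, and the toy distances are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def upgma(dist: List[List[float]], labels: List[str]):
    """Build a rooted tree (nested tuples) from a symmetric distance matrix."""
    n = len(labels)
    # Each cluster stores its subtree and the number of leaves it contains.
    clusters: Dict[int, Tuple[object, int]] = {i: (labels[i], 1) for i in range(n)}
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        merged = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Size-weighted average keeps the distance an average over leaves.
            merged[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(merged)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy distances, invented for illustration only.
labels = ["A", "B", "C", "D"]
dist = [
    [0.0, 2.0, 8.0, 8.0],
    [2.0, 0.0, 8.0, 8.0],
    [8.0, 8.0, 0.0, 4.0],
    [8.0, 8.0, 4.0, 0.0],
]
print(upgma(dist, labels))
```

In a progressive or iterative aligner, a tree of this kind typically determines the order in which sequences and groups are aligned.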

Discussions and Conclusion



Journal: BMC Bioinformatics	Publication Date: Dec 1, 2006
Citations: 60	License type: CC BY 2.0
