Iterative refinement of structure-based sequence alignments by Seed Extension

Changhoon Kim, Byungkook Lee, Chin-Hsien Tai

doi:10.1186/1471-2105-10-210

Abstract

Background: Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment.

Results: At its core, RSE uses SE (Seed Extension), an algorithm that we recently reported for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the fractions of correctly aligned residues before and after refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis, and NCBI's CDD root node set provided the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements were seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs.

Conclusion: RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment, or it can replace the traditional iterative refinement procedures based on a residue-level dynamic programming algorithm in many structure alignment programs.
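
The evaluation described in the abstract compares the fraction of correctly aligned residues against a reference alignment, with or without a tolerated shift error. The sketch below illustrates one plausible way to compute such a fraction; the exact definition used in the paper is not given in this excerpt. The representation of an alignment as a list of `(i, j)` residue index pairs, the function name `fraction_correct`, the parameter `max_shift`, and the toy data are all assumptions made for illustration.

```python
def fraction_correct(test_pairs, ref_pairs, max_shift=0):
    """Fraction of reference residue pairs reproduced by a test alignment.

    Each alignment is a list of (i, j) pairs: residue i of protein A is
    aligned with residue j of protein B (hypothetical representation).
    A reference pair (i, j) counts as correct when the test alignment
    pairs residue i with a residue j_test such that
    |j_test - j| <= max_shift. With max_shift=0, no shift error is allowed.
    """
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned pairs")
    # Map each residue of protein A to its partner in protein B.
    test_map = dict(test_pairs)
    correct = 0
    for i, j in ref_pairs:
        j_test = test_map.get(i)
        if j_test is not None and abs(j_test - j) <= max_shift:
            correct += 1
    return correct / len(ref_pairs)


# Toy data, invented for illustration only.
reference = [(0, 0), (1, 1), (2, 2), (3, 3)]
before = [(0, 0), (1, 2), (2, 3), (3, 4)]  # mostly shifted by one residue
after = [(0, 0), (1, 1), (2, 2), (3, 4)]   # one residue still shifted

print(fraction_correct(before, reference))               # 0.25
print(fraction_correct(after, reference))                # 0.75
print(fraction_correct(before, reference, max_shift=1))  # 1.0
```

In this toy case the "before" alignment looks perfect once a one-residue shift is tolerated, which suggests why a comparison with no shift error allowed is the stricter test of a refinement step.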

Highlights

Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone
We have shown that SE, which is not based on the dynamic programming algorithm and does not use a gap penalty, generates a more accurate alignment on average than programs that use a dynamic programming algorithm
The improvements were small for DaliLite and MATRAS but about 5% for CE and VAST

Summary

Introduction

Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Some methods are probably quite good at detecting structural similarity, yet relatively poor in terms of the accuracy of the sequence alignment they produce [12].



Journal: BMC bioinformatics	Publication Date: Jul 9, 2009
Citations: 47	License type: cc-by

Similar Papers

Accuracy of structure-based sequence alignment of automatic methods
Changhoon Kim ... Byungkook Lee
BMC bioinformatics | VOL. 8
20 Sep 2007

SupeRNAlign: a new tool for flexible superposition of homologous RNA structures and inference of accurate structure-based sequence alignments.
Paweł Piątkowski ... Elżbieta Jankowska
Nucleic acids research | VOL. 45
20 Jul 2017

Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment.
Noah M Daniels ... Lenore J Cowen
BMC bioinformatics | VOL. 13
06 Oct 2012

SE: an algorithm for deriving sequence alignment from a pair of superimposed structures
Chin-Hsien Tai ... Changhoon Kim
BMC bioinformatics | VOL. 10
01 Jan 2009
