Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility

Atsushi Hijikata, Kei Yura, Tosiyuki Noguti, Mitiko Go

doi:10.1002/prot.23011

Abstract

In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three-dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated to an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on a pair of sequences with identity in the “twilight zone” between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/. Proteins 2011; © 2011 Wiley-Liss, Inc.
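The idea of an accessibility-dependent gap penalty can be illustrated with a minimal sketch. It is not the ALAdeGAP implementation; everything below is an assumption made for illustration. If gap frequency grows roughly exponentially with the relative accessibility of a template residue, taking the negative logarithm gives a penalty that falls linearly with accessibility. The constants `buried` and `slope`, the match/mismatch scores (a stand-in for a substitution matrix), and the use of a linear rather than affine gap cost are all hypothetical choices.

```python
def gap_penalty(acc, buried=6.0, slope=4.0):
    """Gap cost next to a template residue with relative accessibility acc in [0, 1].

    Assumed form: negative log of an exponential gap frequency, i.e. linear in acc.
    """
    return buried - slope * acc


def align(template, target, acc, match=2.0, mismatch=-1.0):
    """Global alignment of a template (known structure) against a target sequence.

    acc[i] is the relative accessibility of template residue i (hypothetical input).
    Returns (aligned_template, aligned_target, score).
    """
    n, m = len(template), len(target)
    assert len(acc) == n, "one accessibility value per template residue"
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]

    # First column: template residues aligned to gaps in the target.
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - gap_penalty(acc[i - 1])
        move[i][0] = "del"
    # First row: target residues inserted before the first template residue.
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap_penalty(acc[0])
        move[0][j] = "ins"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if template[i - 1] == target[j - 1] else mismatch
            pen = gap_penalty(acc[i - 1])  # penalty tied to template residue i
            options = [
                (score[i - 1][j - 1] + s, "sub"),
                (score[i - 1][j] - pen, "del"),  # gap in target
                (score[i][j - 1] - pen, "ins"),  # gap in template
            ]
            score[i][j], move[i][j] = max(options, key=lambda t: t[0])

    # Trace back from the bottom-right corner.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        mv = move[i][j]
        if mv == "sub":
            a.append(template[i - 1]); b.append(target[j - 1]); i -= 1; j -= 1
        elif mv == "del":
            a.append(template[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(target[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), score[n][m]


if __name__ == "__main__":
    # Sequence alone cannot decide where the gap goes; accessibility does.
    print(align("AAAA", "AAA", [0.0, 0.0, 1.0, 0.0]))
```

In this toy case the sequences give no information about where the single gap belongs, so the gap is placed opposite the one exposed template residue. A realistic version would use a substitution matrix, affine gap costs, and accessibilities computed from the template structure.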

Highlights

Most proteins perform their function after forming their three-dimensional (3D) structures
Implementation of the gap penalty in a standard sequence alignment method: we developed a program for pairwise amino acid sequence alignment based on the assumption that one of the sequences has a known 3D structure and the other does not
We could not find any obvious relationship between gap accessibility and, for instance, secondary structure that may account for the observed change in slope in figure

Summary

Introduction

Most proteins perform their function after forming their three-dimensional (3D) structures. Knowledge of protein 3D structure is therefore essential for understanding the mechanisms of protein function in atomic detail.[1] A large number of protein structures have been determined systematically by structural genomics projects,[2,3] with the goal of elucidating the function of proteins known from genome sequences. Template-based comparative modeling, based on protein family classification, is currently the most promising method for narrowing the gap between the numbers of proteins with known and unknown structures.[6,7]

Journal: Proteins: Structure, Function, and Bioinformatics
Publication Date: Apr 4, 2011
Citations: 26
License type: unspecified-oa

Similar Papers

Effects of forest gap position in a cypress (Cupressus funebris) plantation on fine root decomposition of Toona sinensis and on soil fungal community diversity
李德会，李相君，吴庆贵，尹海峰，李贤伟 Li Dehui
Acta Ecologica Sinica | VOL. 42
01 Jan 2021

Protein sequence alignment with family-specific amino acid similarity matrices
Igor B Kuznetsov
BMC Research Notes | VOL. 4
16 Aug 2011

Alignment of nucleotide or amino acid sequences on microcomputers, using a modification of Sellers' (1974) algorithm which avoids the need for calculation of the complete distance matrix
Hugh Tyson ... Bryan Haley
Computer Methods and Programs in Biomedicine | VOL. 21
01 Oct 1985

ALIGN_MTX—An optimal pairwise textual sequence alignment program, adapted for using in sequence-structure alignment
Boris Vishnepolsky ... Malak Pirtskhalava
Computational Biology and Chemistry | VOL. 33
04 May 2009
