Abstract

The assignment of secondary structure elements in proteins is a key step in the analysis of their structures and functions. We have developed an algorithm, SACF (secondary structure assignment based on Cα fragments), for secondary structure element (SSE) assignment based on the alignment of Cα backbone fragments with central poses derived by clustering known SSE fragments. The assignment algorithm consists of three steps. First, the outlier fragments on known SSEs are detected. Next, the remaining fragments are clustered to obtain the central fragment of each cluster. Finally, the central fragments are used as templates to make assignments. In a large-scale comparison of 11 secondary structure assignment methods, SACF, KAKSI and PROSS show similar agreement with DSSP, while PCASSO agrees with DSSP best. Taking the DSSP assignment as the standard, SACF and PCASSO tend to trim residues in the N- and C-cap regions, whereas KAKSI, P-SEA and SEGNO tend to add residues at the termini. Moreover, our algorithm is able to assign subtle helices (3₁₀-helix, π-helix and left-handed helix) and make uniform assignments, and to detect, as outlier fragments, rare SSEs within the β-sheets or long helices assigned by other programs. The structural uniformity should be useful for protein structure classification and prediction, while outlier fragments underlie the structure–function relationship.
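The template-matching step can be illustrated with a minimal Python sketch. It is not the SACF implementation. The abstract describes aligning Cα fragments with clustered central fragments, whereas this sketch swaps in a simpler distance RMSD (dRMSD) over internal Cα-Cα distances so that no superposition is needed. The window length, the cutoff, the two templates built from idealized geometry, and the names `ideal_fragment`, `drmsd` and `assign` are all assumptions made for illustration.

```python
import math

def ideal_fragment(n, radius, rise, turn_deg):
    """Cα coordinates of an idealized helix-like fragment (illustrative geometry)."""
    t = math.radians(turn_deg)
    return [(radius * math.cos(i * t), radius * math.sin(i * t), rise * i)
            for i in range(n)]

def drmsd(a, b):
    """Distance RMSD between two equal-length fragments.

    Compares internal Cα-Cα distances, so no superposition is required.
    """
    n = len(a)
    sq = [(math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
          for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(sq) / len(sq))

FRAG_LEN = 5  # hypothetical window size, not taken from the paper

# Hypothetical stand-ins for the clustered central fragments.
TEMPLATES = {
    "H": ideal_fragment(FRAG_LEN, 2.3, 1.5, 100.0),  # α-helix-like
    "E": ideal_fragment(FRAG_LEN, 1.0, 3.3, 180.0),  # rough extended strand
}

def assign(ca, cutoff=1.0):
    """Slide a window along the Cα trace and label residues by the nearest template.

    Residues not covered by any window within the cutoff (in Å) stay '-'.
    Later windows overwrite labels written by earlier ones.
    """
    labels = ["-"] * len(ca)
    for start in range(len(ca) - FRAG_LEN + 1):
        window = ca[start:start + FRAG_LEN]
        best, score = min(((k, drmsd(window, t)) for k, t in TEMPLATES.items()),
                          key=lambda kv: kv[1])
        if score <= cutoff:
            for i in range(start, start + FRAG_LEN):
                labels[i] = best
    return "".join(labels)
```

For example, `assign(ideal_fragment(12, 2.3, 1.5, 100.0))` labels all twelve residues `H`. One limitation of this substitution is that dRMSD cannot tell a fragment from its mirror image, so separating left-handed from right-handed helices would need a superposition-based score.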

Highlights

  • In 1951, Pauling and colleagues first defined two main secondary elements (α-helix and β-sheet) based on the intra-backbone hydrogen bond patterns in proteins [1]

  • Set T consists of 2817 structures with resolutions between 2.0 and 3.0 Å, which was selected to compare our method with ten other programs, including two hydrogen bond-based secondary structural elements (SSE) assignment programs (DSSP and STRIDE) and nine geometry-based methods

  • PCASSO achieves high agreement with DSSP (93.5%) because the protein secondary structures in the training set were assigned by DSSP and 258 geometric features were used in random decision forests
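As a rough illustration of what an agreement figure such as the one above measures, the sketch below computes the percentage of residues that receive the same state from two assignments. It assumes that both assignments are strings over the same reduced state alphabet and that agreement means per-residue identity. This excerpt does not state the exact metric used, and the function name `agreement` is hypothetical.

```python
def agreement(reference, predicted):
    """Percentage of residues given the same state by both assignments."""
    if len(reference) != len(predicted):
        raise ValueError("assignments must cover the same residues")
    if not reference:
        raise ValueError("empty assignment")
    same = sum(r == p for r, p in zip(reference, predicted))
    return 100.0 * same / len(reference)

# Toy strings: the second assignment trims one residue from each helix end.
reference = "--HHHHHHHH--"
trimmed = "---HHHHHH---"
print(round(agreement(reference, trimmed), 1))
```

In this toy case 10 of 12 residues match, which shows how trimming residues at helix caps lowers per-residue agreement even when the same helix is found.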

Introduction

In 1951, Pauling and colleagues first defined two main secondary elements (α-helix and β-sheet) based on the intra-backbone hydrogen bond patterns in proteins [1]. They correctly detected the idealized π-helix but incorrectly predicted that the 3₁₀-helix would not occur due to unfavorable angles. In fact, approximately 4% of residues in proteins have been shown to occur in 3₁₀-helices [2]. Secondary structures have been extensively employed in structure visualization [7], classification [8], comparison [9], and prediction [10].

