PiPred – a deep-learning method for prediction of π-helices in protein sequences

Jan Ludwiczak, Aleksander Winski, Antonio Marinho Da Silva Neto, Stanislaw Dunin-Horkawicz, Krzysztof Szczepaniak, Vikram Alva

doi:10.1038/s41598-019-43189-4

Abstract

Canonical π-helices are short, relatively unstable secondary structure elements found in proteins. They comprise seven or more residues and are present in 15% of all known protein structures, often in functionally important regions such as ligand- and ion-binding sites. Given their similarity to α-helices, the prediction of π-helices is a challenging task and none of the currently available secondary structure prediction methods tackle it. Here, we present PiPred, a neural network-based tool for predicting π-helices in protein sequences. By performing a rigorous benchmark we show that PiPred can detect π-helices with a per-residue precision of 48% and sensitivity of 46%. Interestingly, some of the α-helices mispredicted by PiPred as π-helices exhibit a geometry characteristic of π-helices. Also, despite being trained only with canonical π-helices, PiPred can identify 6-residue-long α/π-bulges. These observations suggest an even higher effective precision of the method and demonstrate that π-helices, α/π-bulges, and other helical deformations may impose similar constraints on sequences. PiPred is freely accessible at: https://toolkit.tuebingen.mpg.de/#/tools/quick2d. A standalone version is available for download at: https://github.com/labstructbioinf/PiPred, where we also provide the CB6133, CB513, CASP10, and CASP11 datasets, commonly used for training and validation of secondary structure prediction methods, with correctly annotated π-helices.
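The per-residue precision and sensitivity quoted above can be illustrated with a minimal sketch. It assumes that the true and predicted secondary structures are given as equal-length strings of per-residue labels and that π-helix residues carry the DSSP label "I"; the function name `per_residue_scores` and the example strings are hypothetical and are not taken from the PiPred code base or benchmark.

```python
def per_residue_scores(true_ss, pred_ss, label="I"):
    """Return (precision, sensitivity) for one secondary structure class.

    true_ss, pred_ss: equal-length strings of per-residue labels.
    label: class of interest; "I" is assumed to mark pi-helix residues.
    """
    if len(true_ss) != len(pred_ss):
        raise ValueError("true and predicted strings must have equal length")

    tp = fp = fn = 0
    for t, p in zip(true_ss, pred_ss):
        if p == label and t == label:
            tp += 1  # residue correctly predicted as pi-helix
        elif p == label:
            fp += 1  # predicted as pi-helix, annotated as something else
        elif t == label:
            fn += 1  # annotated as pi-helix, but missed by the prediction

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, sensitivity


# Made-up example: the predicted pi-helix is shifted by two residues.
true_ss = "HHHHIIIIIIIHHHHCCC"
pred_ss = "HHHHHHIIIIIIIHHCCC"
precision, sensitivity = per_residue_scores(true_ss, pred_ss)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

In this toy case five of the seven predicted π-helix residues overlap the annotated helix, so both scores equal 5/7. Scoring per residue rather than per helix means that a prediction shifted by a few positions still receives partial credit.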

Highlights

Helices, dominant protein secondary structure elements, are defined by the recurring pattern of the hydrogen bonds between the amide hydrogen (NH) and the carbonyl oxygen (CO) atoms
To assess the functional role of π-helices, we surveyed 2,555 representative π-helices present in protein structures co-crystallized with ligands and found that 24% of them interact with at least one ligand, most frequently with protoporphyrin IX and its derivatives, nucleoside derivatives (e.g. NAD, NADP, FAD), and various ions
To systematically investigate the association between the presence of π-helices and biological functions, we performed Gene Ontology (GO) enrichment analysis, with a focus on identifying GO terms overrepresented in proteins containing π-helices

Summary

Introduction

Helices, the dominant protein secondary structure elements, are defined by the recurring pattern of hydrogen bonds between the amide hydrogen (NH) and the carbonyl oxygen (CO) atoms. Unlike α-helices, π-helices, a less frequent type of helix, contain hydrogen bonds between residues in positions i and i + 5 (Fig. 1). Canonical π-helices are characterized by the presence of at least two π-type (i → i + 5) hydrogen bonds, and the minimal length of a π-helix is seven residues[1]. Methods for the annotation of π-helices in protein structures have been developed[1,10,11], providing the possibility of identifying π-helices that are missed by the general-purpose methods. The accuracy of existing secondary structure predictors for the π-helix class ("I") is zero. This can be attributed to the properties of the datasets commonly used in secondary structure prediction problems, such as CB6133[19,22] or CB513[19,23], which contain only a small number of π-helices due to inaccuracies in DSSP[9]. The only method capable of predicting π-helices is limited to those occurring in transmembrane proteins[24].
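The definition of a canonical π-helix given above can be made concrete with a short sketch. It assumes that backbone hydrogen bonds are already available as pairs of residue indices (i, j), and it interprets "at least two π-type hydrogen bonds" as at least two such bonds starting at consecutive residues; both the input format and this interpretation are assumptions made for illustration. The function `canonical_pi_segments` is hypothetical and is not the annotation procedure used by the authors or by DSSP.

```python
def canonical_pi_segments(hbonds, min_bonds=2):
    """Find residue spans covered by runs of pi-type (i -> i + 5) hydrogen bonds.

    hbonds: iterable of (i, j) residue index pairs joined by a backbone H-bond.
    min_bonds: minimum number of pi-type bonds starting at consecutive
               residues (assumed reading of the "at least two" criterion).
    Returns a list of (first_residue, last_residue) tuples, inclusive.
    """
    # Keep only pi-type bonds and remember the residue at which each starts.
    pi_starts = sorted({i for i, j in hbonds if j - i == 5})

    segments = []
    run = []
    for start in pi_starts:
        if run and start == run[-1] + 1:
            run.append(start)  # extends the current run of consecutive bonds
        else:
            if len(run) >= min_bonds:
                segments.append((run[0], run[-1] + 5))
            run = [start]
    if len(run) >= min_bonds:
        segments.append((run[0], run[-1] + 5))
    return segments


# Made-up bonds: two alpha-type (i -> i + 4), two consecutive pi-type,
# and one isolated pi-type bond that does not meet the criterion.
hbonds = [(1, 5), (2, 6), (10, 15), (11, 16), (20, 25)]
segments = canonical_pi_segments(hbonds)
print(segments)
```

Two consecutive bonds, i → i + 5 and i + 1 → i + 6, span residues i through i + 6, which is consistent with the seven-residue minimal length stated in the text.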

Methods

Results

Conclusion

Journal: Scientific Reports	Publication Date: May 3, 2019
Citations: 19	License type: open-access
