A simplified approach to disulfide connectivity prediction from protein sequences

Marc Vincent, Paolo Frasconi, Matthieu Labbé, Andrea Passerini

doi:10.1186/1471-2105-9-20

Abstract

Background: Prediction of disulfide bridges from protein sequences is useful for characterizing structural and functional properties of proteins. Several methods based on different machine learning algorithms have been applied to this problem, and public domain prediction services exist. These methods, however, still leave room for significant improvement in both prediction accuracy and overall architectural complexity.

Results: We introduce new methods for predicting disulfide bridges from protein sequences. The methods take advantage of two new decomposition kernels for measuring the similarity between protein sequences according to the amino acid environments around cysteines. Disulfide connectivity is predicted in two passes. First, a binary classifier is trained to predict whether a given protein chain has at least one intra-chain disulfide bridge. Second, a multiclass classifier (implemented by 1-nearest neighbor) is trained to predict connectivity patterns. The two passes can be easily cascaded to obtain connectivity prediction from sequence alone. We report an extensive experimental comparison on several data sets that have been previously employed in the literature to assess the accuracy of cysteine bonding state and disulfide connectivity predictors.

Conclusion: We reach state-of-the-art results on bonding state prediction with a simple method that classifies chains rather than individual residues. The prediction accuracy reached by our connectivity prediction method compares favorably with all but the most complex other approaches. Moreover, our method does not need any model selection or hyperparameter tuning, a property that makes it less prone to overfitting and to overestimation of prediction accuracy.
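The two-pass cascade described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_kernel` is a simple stand-in that counts matching positions between windows around cysteines (it is not one of the paper's two decomposition kernels), pass 1 uses 1-nearest neighbor as a placeholder for the binary classifier, and restricting pass 2 to training chains with the same number of cysteines is a simplification introduced here. The names `cysteine_windows`, `toy_kernel`, `nearest`, and `predict_connectivity` are hypothetical.

```python
from typing import List, Optional, Tuple

# A connectivity pattern: bridges given as pairs of cysteine ranks (1-based).
Pattern = Tuple[Tuple[int, int], ...]


def cysteine_windows(seq: str, half_width: int = 2) -> List[str]:
    """Return the residue window centered on each cysteine, padded with '-'."""
    padded = "-" * half_width + seq + "-" * half_width
    return [padded[i:i + 2 * half_width + 1]
            for i, aa in enumerate(seq) if aa == "C"]


def toy_kernel(a: str, b: str) -> float:
    """Stand-in similarity: position-wise matches summed over all window pairs."""
    return float(sum(sum(x == y for x, y in zip(wa, wb))
                     for wa in cysteine_windows(a)
                     for wb in cysteine_windows(b)))


def nearest(query: str, train_seqs: List[str]) -> int:
    """Index of the training chain closest to the query in kernel-induced distance."""
    def dist2(t: str) -> float:
        return toy_kernel(query, query) + toy_kernel(t, t) - 2 * toy_kernel(query, t)
    return min(range(len(train_seqs)), key=lambda i: dist2(train_seqs[i]))


def predict_connectivity(query: str,
                         train: List[Tuple[str, Pattern]]) -> Optional[Pattern]:
    """Cascade: chain-level bridge decision, then 1-NN over connectivity patterns."""
    seqs = [s for s, _ in train]
    # Pass 1 (placeholder classifier): does the chain have at least one bridge?
    if not train[nearest(query, seqs)][1]:
        return ()
    # Pass 2: 1-NN among bridged chains with the same number of cysteines
    # (the cysteine-count filter is an assumption of this sketch).
    n_cys = query.count("C")
    candidates = [(s, p) for s, p in train if p and s.count("C") == n_cys]
    if not candidates:
        return None  # no compatible pattern available to transfer
    return candidates[nearest(query, [s for s, _ in candidates])][1]
```

With a toy training set such as `[("ACDEFCGH", ((1, 2),)), ("GGCGGGGCGG", ())]`, a query close to the first chain receives its pattern, while a query close to the second is classified as having no bridge. Because 1-nearest neighbor has no free parameters, the sketch also reflects the abstract's point that no hyperparameter tuning is involved in the pattern-prediction pass.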

Summary

Introduction

Prediction of disulfide bridges from protein sequences is useful for characterizing structural and functional properties of proteins. Since most of the proteins inferred from genomic sequencing lack this structural information, the ab-initio prediction of disulfide bridges from protein sequences can be very useful in several molecular biology studies. This computational problem has received significant attention during the last few years, and a number of prediction servers have recently been developed [1,2,3,4,5]. Given known bonding state, disulfide bridges are assigned by predicting which pairs of half-cystines are linked. The latter sub-problem is considerably more difficult from a machine learning perspective, as it requires methods capable of predicting structured outputs. The main novelty in that method is the use of a recursive neural network that can predict the bonding probability between any pair of cysteines, so that bridges can be predicted directly from sequence (without previous knowledge of cysteine bonding state).
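To make the structured-output difficulty concrete: once the bonded cysteines are known, a connectivity pattern is a way of pairing them all up, and the number of such pairings grows quickly with the number of bridges. The sketch below enumerates the pairings and checks the count against the closed form (2B - 1)!! for B bridges. The function names `connectivity_patterns` and `pattern_count` are hypothetical, and the code illustrates the size of the output space only, not any predictor from the paper.

```python
from typing import Iterator, List, Tuple

Pattern = Tuple[Tuple[int, int], ...]


def connectivity_patterns(cysteines: List[int]) -> Iterator[Pattern]:
    """Yield every way to pair up an even number of bonded cysteines."""
    if not cysteines:
        yield ()
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each possible partner, then recurse.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in connectivity_patterns(remaining):
            yield ((first, partner),) + tail


def pattern_count(n_bridges: int) -> int:
    """Closed form (2B - 1)!! = 1 * 3 * 5 * ... * (2B - 1)."""
    count = 1
    for k in range(1, 2 * n_bridges, 2):
        count *= k
    return count


if __name__ == "__main__":
    for b in range(1, 6):
        enumerated = sum(1 for _ in connectivity_patterns(list(range(1, 2 * b + 1))))
        print(b, enumerated, pattern_count(b))
```

Two bridges allow 3 patterns, three allow 15, and five already allow 945, which is why assigning bridges is treated as a harder problem than classifying the bonding state of each cysteine.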

Journal: BMC Bioinformatics
Publication Date: Jan 14, 2008
Citations: 49
License type: cc-by
