Abstract
Background: Disulfide bonds play an important role in protein folding and structural stability. Accurately predicting disulfide bonds from protein sequences is important for modeling the structural and functional characteristics of many proteins.

Methods: In this work, we introduce an approach to improving disulfide bonding prediction accuracy by taking advantage of context-based features. We first derive first-order and second-order mean-force potentials from the amino acid environment around cysteine residues, using a large number of cysteine samples. The mean-force potentials are integrated into context-based scores that estimate how favorable it is for a cysteine residue to be in the disulfide bonding state, and for a cysteine pair to be connected by a disulfide bond. These context-based scores are then combined with other sequence and evolutionary information as features to train neural networks for disulfide bonding state prediction and connectivity prediction.

Results: In classifying an individual cysteine residue as bonded or free, the 10-fold cross-validated accuracy is 90.8% at the residue level and 85.6% at the protein level, an accuracy improvement of around 2%. The average accuracy of disulfide bonding connectivity prediction also improves, with an overall sensitivity of 73.42% and specificity of 91.61%.

Conclusions: Our computational results show that the context-based scores are effective features for improving the accuracy of both disulfide bonding state prediction and connectivity prediction. Our disulfide prediction algorithm is implemented on a web server named "Dinosolve", available at: http://hpcr.cs.odu.edu/dinosolve.
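To make the idea of a first-order mean-force potential concrete, here is a minimal Python sketch. It is not the authors' implementation: the exact potential definition, the pseudocount smoothing, the reference state (all cysteines regardless of bonding state), and the names `first_order_potentials` and `context_score` are all assumptions made for illustration. The sketch uses the common inverse-Boltzmann form, where the energy of seeing a residue at a given offset from a cysteine is the negative log of its frequency around bonded cysteines relative to its frequency around all cysteines. Second-order (residue-pair) potentials are not covered.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def first_order_potentials(samples, half_window=3, pseudocount=1.0):
    """Estimate per-offset, per-residue energies around cysteines.

    samples: iterable of (sequence, cysteine_index, is_bonded).
    Returns {(offset, residue): energy}, where (assumed form)
    energy = -ln( P(residue | offset, bonded) / P(residue | offset, any state) ).
    Negative energy means the residue is enriched around bonded cysteines.
    """
    bonded, overall = Counter(), Counter()
    bonded_total, overall_total = Counter(), Counter()
    for seq, idx, is_bonded in samples:
        for offset in range(-half_window, half_window + 1):
            pos = idx + offset
            if offset == 0 or pos < 0 or pos >= len(seq):
                continue  # skip the cysteine itself and positions off the chain
            residue = seq[pos]
            if residue not in AMINO_ACIDS:
                continue  # ignore non-standard residue codes
            overall[(offset, residue)] += 1
            overall_total[offset] += 1
            if is_bonded:
                bonded[(offset, residue)] += 1
                bonded_total[offset] += 1

    n = len(AMINO_ACIDS)
    potentials = {}
    for offset in range(-half_window, half_window + 1):
        if offset == 0:
            continue
        for residue in AMINO_ACIDS:
            # Pseudocounts keep both probabilities strictly positive.
            p_bonded = (bonded[(offset, residue)] + pseudocount) / (
                bonded_total[offset] + pseudocount * n)
            p_any = (overall[(offset, residue)] + pseudocount) / (
                overall_total[offset] + pseudocount * n)
            potentials[(offset, residue)] = -math.log(p_bonded / p_any)
    return potentials

def context_score(seq, idx, potentials, half_window=3):
    """Sum the energies of the residues surrounding the cysteine at idx."""
    total = 0.0
    for offset in range(-half_window, half_window + 1):
        pos = idx + offset
        if offset == 0 or not (0 <= pos < len(seq)):
            continue
        total += potentials.get((offset, seq[pos]), 0.0)
    return total
```

In this sketch a lower `context_score` suggests an environment more typical of bonded cysteines; such a score would be one input feature among others, not a classifier on its own.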
Highlights
Disulfide bonds play an important role in protein folding and structure stability
Context-based features with window sizes 3 and 5 slightly improve prediction accuracy compared with using the Position Specific Scoring Matrix (PSSM) alone
Context-based features with window size 7 yield the best performance. This is mainly because a window of size 7 captures the important correlations between residues i and i+3, which occur in many cysteine-containing motifs, such as Cys-X-X-Cys, Cys-X-X-Ser, Cys-X-X-His, Cys-X-X-Pro, and Cys-X-X-Asp
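The link between window size and the i to i+3 correlation can be seen with a short Python sketch. The helper names `cysteine_windows` and `has_i_plus_3_partner`, the `-` padding character, and the partner set `CSHPD` (taken from the motifs listed above) are illustrative assumptions, not part of the published method. A window of size 7 centered on a cysteine spans positions i-3 to i+3, so it is the smallest odd window that can see the third residue in a Cys-X-X-Y motif.

```python
def cysteine_windows(seq, window_size=7, pad="-"):
    """Return (index, window) for every cysteine, padding chain ends."""
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the cysteine is centered")
    half = window_size // 2
    padded = pad * half + seq + pad * half
    # Residue i of seq sits at padded[i + half], so this slice is centered on it.
    return [(i, padded[i:i + window_size])
            for i, residue in enumerate(seq) if residue == "C"]

def has_i_plus_3_partner(window, partners="CSHPD"):
    """True if the window reaches position i+3 and holds a motif partner there."""
    center = len(window) // 2
    pos = center + 3
    return pos < len(window) and window[pos] in partners

seq = "ACGHCA"  # toy sequence containing Cys-X-X-Cys at positions 1 to 4
for size in (3, 5, 7):
    index, window = cysteine_windows(seq, window_size=size)[0]
    print(size, window, has_i_plus_3_partner(window))
```

Only the size-7 window reports the partner, because windows of size 3 and 5 stop at i+1 and i+2 respectively.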
Summary
Disulfide bonds play an important role in protein folding and structural stability. Predicting disulfide bonds from protein sequences is important for modeling the structural and functional characteristics of many proteins. Disulfide bonds are often found in extracellular proteins, where they play an important role in folding and enhance thermodynamic and mechanical stability. Correctly predicting the formation and connectivity of disulfide bonds can reduce the conformational search space when modeling protein structures in three dimensions, and can help predict important protein functions. Prediction proceeds in two stages. The first stage is bonding state prediction, whose goal is to determine whether each cysteine residue in a protein chain is involved in forming a disulfide bond. The second stage is connectivity prediction, in which the cysteine pairs likely to form disulfide bonds are identified.
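The two-stage structure can be sketched in Python as follows. This is an illustration only: in the described work both stages are driven by trained neural networks, whereas here the per-cysteine probabilities and pair scores are assumed to be given, and the 0.5 threshold and the greedy pairing strategy are stand-ins chosen for brevity rather than the authors' procedure. The function names are hypothetical.

```python
from itertools import combinations

def predict_bonded(cys_probs, threshold=0.5):
    """Stage 1: keep cysteines whose bonding probability meets the threshold.

    cys_probs: {sequence_position: probability of being in the bonded state}.
    """
    return [pos for pos, prob in sorted(cys_probs.items()) if prob >= threshold]

def predict_connectivity(bonded_positions, pair_score):
    """Stage 2: pick disjoint cysteine pairs, highest score first (greedy).

    pair_score: callable taking a (pos_a, pos_b) tuple with pos_a < pos_b.
    Each cysteine can take part in at most one disulfide bond.
    """
    candidates = sorted(combinations(sorted(bonded_positions), 2),
                        key=pair_score, reverse=True)
    used, bonds = set(), []
    for a, b in candidates:
        if a not in used and b not in used:
            bonds.append((a, b))
            used.update((a, b))
    return bonds

# Toy inputs: made-up probabilities and pair scores, for illustration only.
cys_probs = {3: 0.9, 10: 0.8, 25: 0.2, 40: 0.95, 52: 0.7}
scores = {(3, 10): 0.3, (3, 40): 0.9, (3, 52): 0.1,
          (10, 40): 0.2, (10, 52): 0.8, (40, 52): 0.4}

bonded = predict_bonded(cys_probs)
bonds = predict_connectivity(bonded, lambda pair: scores[pair])
print(bonded, bonds)
```

Filtering in stage 1 shrinks the candidate set for stage 2, since the number of possible pairings grows quickly with the number of cysteines considered.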