Abstract

Background

Residues in a protein may be buried inside the protein or exposed to the surrounding solvent. Buried residues usually form hydrophobic cores that maintain the structural integrity of the protein, while exposed residues are closely related to protein function. Accurate prediction of the solvent accessibility of residues will therefore greatly facilitate our understanding of both the structure and the function of proteins. Most state-of-the-art prediction approaches consider the burial state of each residue independently, thus neglecting the correlations among residues.

Results

In this study, we present a high-order conditional random field model that considers the burial states of all residues in a protein simultaneously. Our approach exploits not only the correlation among adjacent residues but also the correlation among long-range residues. Experimental results showed that, by exploiting the correlation among residues, our approach outperformed the state-of-the-art approaches in prediction accuracy. In-depth case studies also showed that the high-order statistical model successfully corrected errors committed by the bidirectional recurrent neural network and chain conditional random field models.

Conclusions

Our methods enable the accurate prediction of residue burial states, which should greatly facilitate protein structure prediction and evaluation.

Highlights

  • Residues in a protein might be buried inside or exposed to the solvent surrounding the protein

  • These feature vectors, along with burial state labels, are used to train a machine learning model such as an artificial neural network (ANN) [5, 8, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], a support vector machine (SVM) [9, 10, 20, 22, 23, 24], a deep learning model [25], a conditional neural field (CNF) [2], or a random forest (RF) [6]

  • We present a high-order conditional random field (CRF) model to explicitly exploit the correlations among all residues rather than consider each residue individually


Summary

Results

We present a high-order conditional random field model that considers the burial states of all residues in a protein simultaneously. Our approach exploits the correlation among adjacent residues and the correlation among long-range residues. Experimental results showed that, by exploiting the correlation among residues, our approach outperformed the state-of-the-art approaches in prediction accuracy. In-depth case studies showed that the high-order statistical model successfully corrected errors committed by the bidirectional recurrent neural network and chain conditional random field models.
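
To illustrate why labeling all residues jointly can differ from labeling each residue independently, the following is a minimal sketch of a chain (first-order) model decoded with the Viterbi algorithm. It is not the high-order model presented in this study, which also captures long-range correlations; it only shows the simplest case of correlation among adjacent residues. The state labels, the function names, and all scores below are made-up assumptions chosen for illustration.

```python
# Toy comparison of independent vs. joint (chain) decoding of burial states.
# All names and numbers here are hypothetical; this is not the authors' model.

STATES = ("B", "E")  # B = buried, E = exposed (labels assumed for illustration)


def independent_decode(unary):
    """Pick the highest-scoring state for each residue on its own."""
    return [max(range(len(STATES)), key=lambda s: scores[s]) for scores in unary]


def viterbi_decode(unary, pairwise):
    """Pick the highest-scoring label sequence under a chain model.

    unary[i][s]    : score of state s at residue i
    pairwise[p][s] : score of moving from state p to state s at adjacent residues
    """
    if not unary:
        return []
    k = len(STATES)
    best = [list(unary[0])]  # best[i][s]: best score of any path ending in s at i
    back = []                # back-pointers used to recover the best path
    for i in range(1, len(unary)):
        row, ptr = [], []
        for s in range(k):
            prev = max(range(k), key=lambda p: best[-1][p] + pairwise[p][s])
            row.append(best[-1][prev] + pairwise[prev][s] + unary[i][s])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    state = max(range(k), key=lambda s: best[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def to_labels(path):
    return "".join(STATES[s] for s in path)


# Made-up scores for five residues: the middle residue weakly prefers "E".
unary = [[2.0, 0.0], [2.0, 0.0], [0.4, 0.6], [2.0, 0.0], [2.0, 0.0]]
# Made-up pairwise scores that reward neighbours sharing the same state.
pairwise = [[0.5, -0.5], [-0.5, 0.5]]

print(to_labels(independent_decode(unary)))        # BBEBB
print(to_labels(viterbi_decode(unary, pairwise)))  # BBBBB
```

In this toy case the independent decoder labels the middle residue as exposed on weak evidence, while the joint decoder overrides it because its neighbours are confidently buried. A high-order model extends the same idea to interactions beyond adjacent residues.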

