Ab Initio Protein Structure Prediction Research Articles

Amide hydrogen-deuterium exchange (HDX) has long been used to determine regional flexibility and binding sites in proteins; however, the data are too sparse for full structural characterization. Experiments that measure HDX rates, such as HDX-NMR, have far higher throughput than structure determination via X-ray crystallography, cryo-EM, or a full suite of NMR experiments. Data from HDX-NMR experiments encode information on the protein structure, making HDX a prime candidate to be supplemented by computational algorithms for protein structure prediction. We have developed a methodology to incorporate HDX-NMR data into ab initio protein structure prediction using the Rosetta software framework to predict structures based on experimental agreement. To demonstrate the efficacy of our algorithm, we examined 38 proteins with HDX-NMR data available, comparing the predicted model with and without the incorporation of HDX data into scoring. The root-mean-square deviation (rmsd, a measure of the average atomic distance between superimposed models) of the predicted model improved by 1.42 Å on average after incorporating the HDX-NMR data into scoring. The average rmsd improvement for the proteins where the selected model rmsd changed after incorporating HDX data was 3.63 Å, including one improvement of more than 11 Å and seven proteins improving by greater than 4 Å, with 12/15 proteins improving overall. Additionally, for independent verification, two proteins that were not part of the original benchmark were scored including HDX data, with a dramatic improvement of the selected model rmsd of nearly 9 Å for one of the proteins. Moreover, we have developed a confidence metric allowing us to successfully identify near-native models in the absence of a native structure. Improvement in model selection with a strong confidence measure demonstrates that protein structure prediction with HDX-NMR is a powerful tool that can be performed with minimal additional computational strain and expense.
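
As a rough illustration of the rmsd measure used in the abstract above, the following minimal sketch computes the root-mean-square deviation between two coordinate sets. It assumes the two models are already superimposed and have matching atoms in the same order; the function name `rmsd` and the sample coordinates are hypothetical and are not taken from the paper or from Rosetta.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two already-superimposed models.

    Each argument is a list of (x, y, z) tuples in angstroms, with atoms
    paired by position in the list (an assumption of this sketch).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical two-atom example: one atom matches, the other is 2 Å away.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(rmsd(model, native))  # sqrt(4 / 2) = sqrt(2)
```

In practice the optimal superposition is computed first (for example with the Kabsch algorithm), which this sketch deliberately leaves out.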

The protein disulfide bond is a covalent bond that forms during post-translational modification by the oxidation of a pair of cysteines. In proteins, the disulfide bond is the most frequent covalent link between amino acids after the peptide bond. It plays a significant role in three-dimensional (3D) ab initio protein structure prediction (aiPSP), stabilizing protein conformation, post-translational modification, and protein folding. In aiPSP, the location of disulfide bonds can strongly reduce the conformational search space by imposing geometrical constraints. Existing experimental techniques for the determination of disulfide bonds are time-consuming and expensive. Thus, developing sequence-based computational methods for disulfide bond prediction becomes indispensable. This study proposed a stacking-based machine learning approach for disulfide bond prediction (diSBPred). Various useful sequence- and structure-based features are extracted for effective training, including conservation profile, residue solvent accessibility, torsion angle flexibility, disorder probability, the sequential distance between cysteines, and more. The prediction of disulfide bonds is carried out in two stages: first, individual cysteines are predicted as either bonding or non-bonding; second, the cysteine-pairs are predicted as either bonding or non-bonding by including the results from cysteine bonding prediction as a feature. A comparison of the features employed in this study with those utilized in the existing nearest neighbor algorithm (NNA) method shows that the features used in this study improve jackknife-validation balanced accuracy by about 7.39 %. Moreover, for individual cysteine bonding prediction and cysteine-pair bonding prediction, diSBPred provides a 10-fold cross-validation balanced accuracy of 82.29 % and 94.20 %, respectively. Altogether, our predictor achieves an improvement of 43.25 % based on balanced accuracy compared to the existing NNA-based approach. Thus, diSBPred can be utilized to annotate the cysteine bonding residues of protein sequences whose structures are unknown as well as improve the accuracy of the aiPSP method, which can further aid in experimental studies of the disulfide bond and structure determination.
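
The balanced accuracy reported in the abstract above is conventionally defined as the mean of sensitivity and specificity for a binary classifier. The sketch below shows that standard definition only; the function name `balanced_accuracy` and the labels are hypothetical, and this is not diSBPred's own evaluation code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = bonding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of bonding cases recovered
    specificity = tn / (tn + fp)  # fraction of non-bonding cases recovered
    return (sensitivity + specificity) / 2


# Hypothetical labels: four bonding and two non-bonding cysteines.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))  # (0.75 + 0.5) / 2 = 0.625
```

Because each class contributes equally, this measure is less easily inflated than plain accuracy when one class is much more common than the other.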

Articles published on Ab Initio Protein Structure Prediction

An outlook on structural biology after AlphaFold: tools, limits and perspectives.

Ab initio protein structure prediction: the necessary presence of external force field as it is delivered by Hsp40 chaperone

Protein structure prediction with energy minimization and deep learning approaches.

Fast and accurate Ab Initio Protein structure prediction using deep learning potentials.

Deep learning geometrical potential for high-accuracy ab initio protein structure prediction

Neprosin belongs to a new family of glutamic peptidase based on in silico evidence

On the need to introduce environmental characteristics in ab initio protein structure prediction using a coarse-grained UNRES force field

Protein Structure Prediction from NMR Hydrogen-Deuterium Exchange Data.

Protein Structure Prediction Using Population-Based Algorithm Guided by Information Entropy

DiSBPred: A machine learning based approach for disulfide bond prediction

DeepDist: real-value inter-residue distance prediction with deep residual convolutional network

A putative origin of the insect chemosensory receptor superfamily in the last common eukaryotic ancestor.

Rosetta and the Journey to Predict Proteins’ Structures, 20 Years on

Deep Learning-based Ab Initio Protein Structure Prediction and Structure-based Protein Function Annotation

Ranking near-native candidate protein structures via random forest classification

Improved fragment sampling for ab initio protein structure prediction using deep neural networks

Protein Structure Determination in Living Cells.

Performance comparison of ab initio protein structure prediction methods

Selecting Near-Native Protein Structures from Predicted Decoy Sets Using Ordered Graphlet Degree Similarity.

Selecting near‐native protein structures from ab initio models using ensemble clustering
