De novo structure prediction of globular proteins aided by sequence variation-derived contacts.

Tomasz Kosciolek, David T Jones, Charlotte M Deane

doi:10.1371/journal.pone.0092197

Open Access

https://doi.org/10.1371/journal.pone.0092197

Abstract

The advent of high-accuracy residue-residue intra-protein contact prediction methods has enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm, FRAGFOLD, with PSICOV, a contact prediction method that uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall, we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than using either statistical potentials or contacts alone. Up to nearly 80% of predictions within the analysed dataset are correct (TM-score ≥ 0.5), with a mean TM-score of 0.54. Unsuccessful modelling cases arose from either conformational sampling problems or insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts in determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide to the choice of correct models and to the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well suited to blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step.
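The fraction of satisfied predicted long-range contacts mentioned in the abstract can be illustrated with a short sketch. This is not the authors' code: the function name, the 8 Å distance cutoff, the sequence separation threshold of 24 residues, and the use of one representative atom per residue are all assumptions chosen for illustration.

```python
import math


def satisfied_long_range_fraction(coords, contacts, min_separation=24, cutoff=8.0):
    """Return the fraction of predicted long-range contacts satisfied in a model.

    coords   : list of (x, y, z) tuples, one representative atom per residue
               (assumption: e.g. C-beta, indexed from 0).
    contacts : iterable of (i, j) residue index pairs predicted to be in contact.
    min_separation, cutoff : illustrative defaults, not values taken from the paper.
    """
    # Keep only pairs far apart in sequence ("long-range" contacts).
    long_range = [(i, j) for i, j in contacts if abs(i - j) >= min_separation]
    if not long_range:
        return 0.0
    # A contact counts as satisfied when the two residues are within the cutoff.
    satisfied = sum(
        1 for i, j in long_range if math.dist(coords[i], coords[j]) <= cutoff
    )
    return satisfied / len(long_range)


# Toy model: an extended chain whose last residue folds back onto the first.
coords = [(i * 3.8, 0.0, 0.0) for i in range(30)]
coords[29] = (0.0, 5.0, 0.0)
predicted = [(0, 29), (1, 28), (2, 5)]
print(satisfied_long_range_fraction(coords, predicted))
```

In this toy case two of the three predicted pairs are long-range, and only one of those is within the cutoff, so half of the long-range contacts are satisfied.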

Highlights

For some time the importance of residue-residue contacts in protein structure prediction has been known and explored [1,2,3,4].
The protein test set is comprehensive: in this study, a diverse set of 150 globular proteins was used as targets.
These combinations centred around three main choices: (i) Should all predicted contacts be used, or only the most confidently predicted ones? (ii) How heavily should contact information be weighted in comparison with the standard FRAGFOLD energy terms? (iii) What function should be used to transform predicted contacts, along with their associated precision estimates, into good pseudoenergy terms? After trying various combinations, we found the optimum performance on the small validation set to be as follows: (1) Use the full list of predicted contacts produced by PSICOV, as any of the contacts can potentially contribute to the determination of a correct fold.
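To make the three choices above concrete, here is a minimal sketch of how predicted contacts and their precision estimates could be turned into a pseudoenergy term. The functional form is hypothetical: the flat reward inside 8 Å, the Gaussian decay beyond it, and the `weight` and `width` parameters are assumptions for illustration, and this page does not state the function actually used with FRAGFOLD.

```python
import math


def contact_pseudoenergy(distance, precision, weight=1.0, cutoff=8.0, width=2.0):
    """Hypothetical pseudoenergy for one predicted contact.

    A satisfied contact earns a reward proportional to its estimated
    precision; beyond the cutoff the reward decays smoothly towards zero.
    """
    reward = -weight * precision
    if distance <= cutoff:
        return reward
    return reward * math.exp(-((distance - cutoff) / width) ** 2)


def total_contact_energy(coords, contacts, weight=1.0):
    """Sum the hypothetical pseudoenergy over the full list of predicted contacts.

    contacts : iterable of (i, j, precision) triples.
    weight   : relative weight of the contact term against other energy terms.
    """
    return sum(
        contact_pseudoenergy(math.dist(coords[i], coords[j]), p, weight)
        for i, j, p in contacts
    )


# Toy example: one satisfied high-precision contact, one distant low-precision one.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
contacts = [(0, 1, 0.9), (0, 2, 0.2)]
print(total_contact_energy(coords, contacts))
```

Using every predicted contact, as in choice (1), is cheap under a form like this because low-precision contacts contribute only small rewards, so they can help without dominating the total.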

Summary

Introduction

The importance of residue-residue contacts in protein structure prediction has been known and explored [1,2,3,4]. In contact prediction methodology, there has been significant recent progress, spanning simple mutual information calculations in multiple sequence alignments (MSAs) [8], statistical covariance analyses [1,2,3,9], pattern recognition techniques (e.g. Support Vector Machine and neural network based approaches) [10,11,12,13] to the ones based on. All of these methods have recently been comprehensively reviewed [14].
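The simplest of the approaches listed above, mutual information between two alignment columns, can be sketched as follows. This is a bare illustration under stated assumptions: the function name is hypothetical, gaps are treated as ordinary symbols, and there is no sequence weighting, pseudocount, or background correction, all of which practical methods usually add.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    msa : list of equal-length aligned sequences (strings).
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        # Each observed pair adds p(a,b) * log(p(a,b) / (p(a) * p(b))).
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Perfectly co-varying columns versus independent columns.
covarying = ["AC", "AC", "GT", "GT"]
independent = ["AC", "AT", "GC", "GT"]
print(column_mutual_information(covarying, 0, 1))
print(column_mutual_information(independent, 0, 1))
```

Columns that always change together score high, while columns that vary independently score zero. Raw mutual information cannot separate direct couplings from indirect ones, which is the limitation that covariance-based methods such as PSICOV aim to address.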

Methods

Results

Conclusion

Journal: PLoS ONE | Publication Date: Mar 17, 2014
Citations: 105 | License type: CC BY 4.0

Similar Papers

PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments
David T Jones ... Domenico Cozzetto
Bioinformatics | VOL. 28 | 17 Nov 2011

CMWeb: an interactive on-line tool for analysing residue-residue contacts and contact prediction methods
D Kozma ... I Simon
Nucleic Acids Research | VOL. 40 | 04 Jun 2012

Analysis of several key factors influencing deep learning-based inter-residue contact prediction.
Tianqi Wu ... Jianlin Cheng
Bioinformatics | VOL. 36 | 30 Aug 2019

Forecasting residue-residue contact prediction accuracy.
P P Wozniak ... M Kotulska
Bioinformatics (Oxford, England) | VOL. 33 | 26 Jun 2017
