Abstract

The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literature, and is mostly motivated by empirical and practical reasons, such as the computational cost of assessing the numerous folds of the protein conformational space. Here, we present a comprehensive study, carried out on a large and balanced benchmark of predicted protein structures, to see how different types of structural representations rank in either accuracy or calculation speed, and which ones offer the best compromise between these two criteria. We tested ten representations, including low-resolution, high-resolution, and coarse-grained approaches. We also investigated whether the findings generalize to formalisms other than the widely used "potential of mean force" (PMF) method. We observed that representing protein structures by their β carbons, combined or not with Cα, provides the best speed-accuracy trade-off when using a "total information gain" scoring function. For statistical PMFs, using MARTINI backbone and side-chain beads is the best option. Finally, we also demonstrated the necessity of training the reference state on all atom types, and of including the Cα atoms of glycine residues in a Cβ-based representation.
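To make the Cβ-based representation and the PMF formalism more concrete, here is a minimal Python sketch, not the authors' implementation. It assumes atoms are given as hypothetical `(residue_id, residue_name, atom_name, xyz)` tuples, uses an arbitrary 1 Å binning up to 15 Å, and scores each distance bin with the generic inverse-Boltzmann form, -ln(p_obs / p_ref), in units of kT. All function names, the bin settings, and the pseudocount are illustrative assumptions; the reference state is simply passed in as a histogram, and the "total information gain" function is not covered.

```python
import math
from itertools import combinations

def select_cb_atoms(atoms):
    """Keep one point per residue: Cβ, or Cα for glycine (which has no Cβ)."""
    selected = []
    for res_id, res_name, atom_name, xyz in atoms:
        if atom_name == "CB" or (res_name == "GLY" and atom_name == "CA"):
            selected.append((res_id, xyz))
    return selected

def distance_histogram(points, bin_width=1.0, max_dist=15.0):
    """Count pairwise distances below max_dist in bins of bin_width (assumed values)."""
    n_bins = math.ceil(max_dist / bin_width)
    counts = [0] * n_bins
    for (_, a), (_, b) in combinations(points, 2):
        d = math.dist(a, b)
        if d < max_dist:
            counts[int(d / bin_width)] += 1
    return counts

def pmf_scores(observed, reference, pseudocount=1e-6):
    """Inverse-Boltzmann score per bin, in units of kT: -ln(p_obs / p_ref)."""
    total_obs = sum(observed) or 1
    total_ref = sum(reference) or 1
    scores = []
    for obs, ref in zip(observed, reference):
        p_obs = obs / total_obs + pseudocount  # pseudocount avoids log(0)
        p_ref = ref / total_ref + pseudocount
        scores.append(-math.log(p_obs / p_ref))
    return scores
```

In this sketch, a bin that is more populated in the observed structures than in the reference histogram receives a negative (favorable) score. How the reference histogram is built is left open here; the abstract only states that it should be trained on all atom types.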

Highlights

  • Protein folding ranks among the most important unsolved problems in science [1]

  • When devising a scoring function to evaluate the folds found by sampling protein conformational space, accuracy is the primary criterion

  • We have compared the accuracy of statistical potentials based on 10 different representations of protein structure, and following 2 different formalisms (Table 2)

Introduction

Protein folding ranks among the most important unsolved problems in science [1]. B. Anfinsen for demonstrating the thermodynamic spontaneity of this process, researchers have wondered how to predict the three-dimensional conformation of the polypeptide chain from the amino acid sequence alone. This scientific question could even be dated ten years earlier, since the X-ray crystallographic study of the structure of myoglobin, by M. The critical nature of the problem arises from the facts that (i) protein function results from the 3D structure, through dynamical features and interactions with other biomolecules, and (ii) the experimental determination of native conformations remains challenging, despite the recent advances in cryogenic electron microscopy techniques. The development of a computational method that could accurately predict protein fold would have a
