Abstract

Knowledge-based approaches use statistics collected from Protein Data Bank structures to estimate effective interaction potentials between amino acid pairs. They typically employ empirical relations that rest on the crucial choice of a reference state associated with the null-interaction case. Despite their considerable effectiveness, the physical interpretation of knowledge-based potentials has been repeatedly questioned, and there is no consensus on the choice of the reference state. Here we use the fact that the Flory theorem, originally derived for chains in a dense polymer melt, also holds for chain fragments within the core of globular proteins, provided that the average is taken over buried fragments collected from different non-redundant native structures. After verifying that the ensuing Gaussian statistics, a hallmark of effectively non-interacting polymer chains, holds for a wide range of fragment lengths, although with significant deviations at short spatial scales, we use it to define a 'bona fide' reference state. Notably, although the reference state depends on fragment length, deviations from it do not. This makes it possible to estimate an effective interaction potential that is not biased by correlations due to the connectivity of the protein chain. We show how different sequence-independent effective statistical potentials can be derived with this approach by coarse-graining the protein representation at varying levels. The possibility of defining sequence-dependent potentials is also explored.
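
For orientation, the Gaussian statistics invoked above can be written out explicitly. The block below is a sketch based on standard ideal-chain results, not formulas quoted from the paper: the symbols b (effective bond length) and ℓ (fragment length in bonds), and the inverse-Boltzmann form chosen for the potential, are assumptions introduced here for illustration.

```latex
% Ideal (Gaussian) chain: distribution of the end-to-end distance r
% for a fragment of \ell bonds with effective bond length b (assumed notation)
P^{\mathrm{ref}}_{\ell}(r) = 4\pi r^{2}\left(\frac{3}{2\pi \ell b^{2}}\right)^{3/2}
  \exp\!\left(-\frac{3 r^{2}}{2 \ell b^{2}}\right),
\qquad \langle r^{2} \rangle_{\ell} = \ell\, b^{2}.

% Conventional inverse-Boltzmann estimate of an effective potential from the
% deviation of the observed statistics from the reference state (assumed form)
U_{\ell}(r) = -k_{B}T \,\ln\frac{P^{\mathrm{obs}}_{\ell}(r)}{P^{\mathrm{ref}}_{\ell}(r)}.
```

In this notation, the statement that deviations from the reference state do not depend on fragment length would mean that the ratio inside the logarithm is approximately independent of ℓ, even though the reference distribution itself is not.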

Highlights

  • In order to assess the hypothesis that the Flory theorem holds for fragments buried in the interior of globular proteins, we analyzed a large data-set of 7793 globular proteins

  • In data-set pruning, each protein is represented as a polymer whose monomers are placed at the Cα atomic positions of the N amino acids

  • As a first result of this paper, we have confirmed that the statistical properties of an ensemble of long enough fragments, collected from different globular proteins and selected to be buried in their interior, are similar to those of Gaussian ideal chains in a polymer melt [49]
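
The Gaussian scaling described in these highlights can be illustrated with a toy calculation. The sketch below is not the authors' pipeline: it replaces real Cα traces with synthetic freely jointed chains (the 3.8 Å bond length, the chain lengths, and all function names are assumptions made for illustration), and it does not select buried fragments. It only shows how one might estimate the exponent 2ν in ⟨R²⟩ ∝ ℓ^{2ν} from fragment end-to-end distances, which should be close to 1 for ideal chains.

```python
import math
import random


def random_walk(n_steps, bond=3.8, seed=0):
    """Toy ideal chain standing in for a C-alpha trace (assumed bond length in angstroms)."""
    rng = random.Random(seed)
    pos = (0.0, 0.0, 0.0)
    chain = [pos]
    for _ in range(n_steps):
        # Isotropic step direction: normalise a 3D Gaussian vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        pos = tuple(p + bond * c / norm for p, c in zip(pos, v))
        chain.append(pos)
    return chain


def mean_sq_end_to_end(chains, length):
    """Average squared end-to-end distance over all fragments spanning `length` bonds."""
    total, count = 0.0, 0
    for chain in chains:
        for i in range(len(chain) - length):
            a, b = chain[i], chain[i + length]
            total += sum((x - y) ** 2 for x, y in zip(a, b))
            count += 1
    return total / count


def scaling_exponent(chains, lengths):
    """Least-squares slope of log<R^2> versus log(length), i.e. an estimate of 2*nu."""
    xs = [math.log(l) for l in lengths]
    ys = [math.log(mean_sq_end_to_end(chains, l)) for l in lengths]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Pool fragments from many independent chains, as one would pool many proteins.
chains = [random_walk(200, seed=s) for s in range(50)]
two_nu = scaling_exponent(chains, [5, 10, 20, 40])
print(f"fitted 2*nu = {two_nu:.2f} (ideal Gaussian chain: 1)")
```

With real data, `chains` would be replaced by lists of Cα coordinates restricted to buried fragments, and deviations of the fitted exponent from 1 at short fragment lengths would be expected, in line with the deviations at short spatial scales noted in the abstract.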


Summary

Introduction

Proteins are linear, flexible hetero-polymers made up of 20 different natural amino-acid species [1]. Most natural proteins in solution have roughly compact shapes and are usually referred to as globular proteins. The fundamental property of globular protein sequences is their ability to attain a compact, native, three-dimensional folded conformation under physiological conditions [2]. The biological function of proteins is intimately related to their native structures and to the dynamical properties encoded in them [3]. Quantitative theoretical modeling requires, in principle, a detailed description at the atomic level, for example to account accurately for the subtle yet dramatic effects that a single residue mutation can bring about.

