Abstract

The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a CE, the amount of time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10^7 compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1–4.7 kcal/mol, R^2 = 0.7–1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time.
Given these advantages, CE is likely to find many uses in computational structural modeling.
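
The idea of a sequence-based expansion truncated at single-residue and residue-pair terms can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the four-letter alphabet, the three-site backbone, the reference residue, and the function toy_energy, which stands in for an expensive full-atom calculation. The coefficients are obtained here by simple substitution relative to a reference sequence, chosen for brevity; it is not presented as the derivation procedure used by the authors.

```python
import itertools

ALPHABET = "AVLK"   # toy alphabet (assumption; real designs use ~20 residues)
N_SITES = 3         # toy backbone with three design positions (assumption)
REF = "A"           # reference residue; its contribution sits in the constant


def toy_energy(seq):
    """Hypothetical stand-in for a slow full-atom energy calculation."""
    energy = sum(-1.0 if r in "AVL" else 0.5 for r in seq)
    for i, j in itertools.combinations(range(len(seq)), 2):
        if seq[i] == "K" and seq[j] == "K":
            energy += 2.0   # made-up like-charge penalty
        if seq[i] == "L" and seq[j] == "L":
            energy -= 0.7   # made-up packing bonus
    return energy


def derive_ce():
    """One-time derivation: constant, single-residue, and pair terms."""
    ref_seq = [REF] * N_SITES
    const = toy_energy(ref_seq)
    point = {}
    for i in range(N_SITES):
        for a in ALPHABET:
            if a != REF:
                s = list(ref_seq)
                s[i] = a
                point[(i, a)] = toy_energy(s) - const
    pair = {}
    for i, j in itertools.combinations(range(N_SITES), 2):
        for a in ALPHABET:
            for b in ALPHABET:
                if a != REF and b != REF:
                    s = list(ref_seq)
                    s[i], s[j] = a, b
                    pair[(i, j, a, b)] = (toy_energy(s) - const
                                          - point[(i, a)] - point[(j, b)])
    return const, point, pair


def ce_energy(seq, ce):
    """Fast evaluation: table lookups only, no structural calculation."""
    const, point, pair = ce
    energy = const
    for i, a in enumerate(seq):
        energy += point.get((i, a), 0.0)
    for i, j in itertools.combinations(range(len(seq)), 2):
        energy += pair.get((i, j, seq[i], seq[j]), 0.0)
    return energy


ce = derive_ce()
all_seqs = ["".join(p) for p in itertools.product(ALPHABET, repeat=N_SITES)]
max_err = max(abs(ce_energy(s, ce) - toy_energy(s)) for s in all_seqs)
```

Because the toy energy contains only single-residue and pair contributions, the truncated expansion reproduces it to within floating-point error for all 64 sequences. An energy with genuine higher-order couplings would leave a residual error under this truncation, which is consistent with the abstract's remark that higher-order terms matter for the more globular folds.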

Highlights

  • Protein structure prediction, homology modeling, fold recognition, and design, including the prediction and design of macromolecular interactions, are among the most complex and essential problems in contemporary computational structural biology

  • We expanded the energy of a sequence adopting a particular backbone conformation; evaluating this energy is a necessary component of protein design and of some methods for fold recognition

  • In Results we describe the application of cluster expansion (CE) to model the energetics of three different protein folds—the parallel dimeric coiled coil, the zinc finger, and the WW domain

Introduction

Protein structure prediction, homology modeling, fold recognition, and design, including the prediction and design of macromolecular interactions, are among the most complex and essential problems in contemporary computational structural biology. Designing proteins with specific structure and function is important because of the usefulness of proteins as reagents and therapeutics [1]. At the heart of any computational approach to protein design or structure prediction lies the problem of determining the fitness (effective energy) of a particular protein in a given conformation or state. In the fold-recognition approach to structure prediction (called threading), the goal is to identify the most suitable structure for a particular sequence, given a library of known folds. In both cases, the complexity of the problem imposes two sometimes conflicting requirements on the energy function used: physical accuracy and computational efficiency.
