Abstract

Native proteins have been optimized by evolution simultaneously for structure and sequence. Structural databases reflect this interdependency. In this paper, we present a new statistical potential for a reduced backbone representation that has both structure and sequence characteristics as variables. We use information from structural data available in the Protein Coil Library, selected on the basis of resolution and refinement factor. In these structures, the nonlocal interactions are randomly distributed and thus average out in the statistics, so structural propensities due to local backbone-based interactions can be studied separately. We collect data in the form of local sequence-specific phi-psi backbone dihedral pairs. From these data, we construct dihedral probability density functions (DPDFs) that quantify the distribution of any adjacent phi-psi pair in the context of all possible combinations of local residue types. We use a probabilistic analysis to deduce how the correlations encoded in the various DPDFs, as well as in residue frequencies, propagate along the sequence and can be accumulated in a statistical potential capable of efficiently scoring a loop from its backbone conformation and sequence alone. Our potential identifies with high accuracy the native structure of a loop with a given sequence among alternative conformations drawn from sets of well-constructed decoys. Conversely, the potential can also be used for sequence prediction problems and is shown to score the native sequence of a given loop structure among the fittest of the possible sequence combinations. Applications for both structure prediction and sequence design are discussed.
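The general idea of turning observed phi-psi statistics into a score can be illustrated with a deliberately simplified sketch. It is not the potential described above: it conditions each phi-psi pair on a single residue type only, treats residues as independent, and uses a coarse histogram in place of a density function, whereas the DPDFs in the paper account for adjacent pairs and neighboring residue types. The bin width, the pseudocount, the uniform reference state, and all function names (`bin_index`, `build_histograms`, `score`) are assumptions made for illustration.

```python
import math
from collections import defaultdict

BIN = 30                 # bin width in degrees (illustrative assumption)
NBINS = 360 // BIN       # number of bins per dihedral angle


def bin_index(angle):
    """Map an angle in degrees to a bin index, wrapping into [-180, 180)."""
    return int(((angle + 180.0) % 360.0) // BIN)


def build_histograms(observations, pseudocount=1.0):
    """Estimate P(phi, psi | residue) from (residue, phi, psi) observations.

    A pseudocount keeps every bin probability strictly positive.
    """
    counts = defaultdict(
        lambda: [[pseudocount] * NBINS for _ in range(NBINS)]
    )
    for residue, phi, psi in observations:
        counts[residue][bin_index(phi)][bin_index(psi)] += 1.0

    probs = {}
    for residue, table in counts.items():
        total = sum(sum(row) for row in table)
        probs[residue] = [[c / total for c in row] for row in table]
    return probs


def score(sequence, dihedrals, probs):
    """Sum of -log(P / P_uniform) over residues; lower means more favorable.

    Assumes independent residues, which ignores the neighbor correlations
    that the paper's potential is built to capture.
    """
    uniform = 1.0 / (NBINS * NBINS)
    total = 0.0
    for residue, (phi, psi) in zip(sequence, dihedrals):
        p = probs[residue][bin_index(phi)][bin_index(psi)]
        total += -math.log(p / uniform)
    return total
```

With histograms built from a set of (residue, phi, psi) observations, `score(sequence, dihedrals, probs)` can rank alternative conformations of one sequence, or alternative sequences for one conformation, mirroring the two uses mentioned in the abstract.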
