Abstract

Loops in proteins are flexible regions connecting regular secondary structures. They are often involved in protein function by interacting with other molecules. The irregularity and flexibility of loops make their structures difficult to determine experimentally and challenging to model computationally. Conformation sampling and energy evaluation are the two key components of loop modeling. We have developed a new method for loop conformation sampling and prediction based on a chain-growth sequential Monte Carlo sampling strategy, called Distance-guided Sequential chain-Growth Monte Carlo (DiSGro). With an energy function designed specifically for loops, our method efficiently generates high-quality, low-energy loop conformations that are enriched with near-native loop structures. The average minimum global backbone RMSD for 1,000 conformations of 12-residue loops is Å, with a lowest-energy RMSD of Å, and an average ensemble RMSD of Å. A novel geometric criterion is applied to speed up calculations. The computational cost of generating 1,000 conformations for each of the x loops in a benchmark dataset is only about CPU minutes for 12-residue loops, compared to ca. CPU minutes using the FALCm method. Test results on benchmark datasets show that DiSGro performs comparably to or better than previous successful methods, while requiring far less computing time. DiSGro is especially effective in modeling longer loops (– residues).
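
The abstract reports three RMSD statistics over an ensemble of sampled loops: the minimum RMSD, the RMSD of the lowest-energy conformation, and the ensemble average. The following is a minimal sketch of how such statistics could be computed, not the authors' implementation. The function names `backbone_rmsd` and `ensemble_summary` are hypothetical, the coordinates and energies are toy values, and the sketch assumes every conformation is already expressed in the same frame as the native structure, so no superposition is performed.

```python
import math


def backbone_rmsd(conf, native):
    """RMSD between two equally long lists of (x, y, z) atom coordinates.

    Assumes both structures share one reference frame (no superposition).
    """
    if len(conf) != len(native):
        raise ValueError("conformations must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_c, atom_n in zip(conf, native)
        for a, b in zip(atom_c, atom_n)
    )
    return math.sqrt(squared / len(native))


def ensemble_summary(confs, energies, native):
    """Return the three ensemble statistics named in the abstract."""
    rmsds = [backbone_rmsd(c, native) for c in confs]
    # Index of the conformation with the lowest energy.
    best = min(range(len(energies)), key=lambda i: energies[i])
    return {
        "min_rmsd": min(rmsds),
        "lowest_energy_rmsd": rmsds[best],
        "mean_rmsd": sum(rmsds) / len(rmsds),
    }


# Toy data (hypothetical): a 4-atom "native" loop at the origin and three
# sampled conformations shifted by 1, 2 and 3 units along every axis.
native = [(0.0, 0.0, 0.0)] * 4
confs = [[(d, d, d)] * 4 for d in (1.0, 2.0, 3.0)]
energies = [5.0, -2.0, 0.5]  # made-up energies, one per conformation

summary = ensemble_summary(confs, energies, native)
```

In this toy ensemble the lowest-energy conformation is not the one closest to the native structure, which is why the minimum RMSD and the lowest-energy RMSD are reported as separate quantities.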

Highlights

  • Despite significant progress made in the past in loop modeling, current methods still cannot generate near-native loop conformations rapidly

Introduction

Protein loops connect regular secondary structures and are flexible regions on the protein surface. They often play important functional roles in the recognition and binding of small molecules or other proteins [1,2,3]. The flexibility and irregularity of loops make their structures difficult to resolve experimentally [4] and challenging to model computationally [5,6]. Among existing methods for loop prediction, template-free methods build loop structures de novo through conformational search [5,6,7,9,10,13,14,17,18,21,23,28]. Recent advances in template-free loop modeling have enabled the prediction of long loop structures with impressive accuracy when crystal contacts or protein-family-specific information, such as that of the GPCR family, is taken into account [14,23,25].
