Abstract
Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. These approaches generally start with a coarse-grained optimization in which main-chain atoms and side-chain centroids are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during the coarse-grained phase that fragment-based methods intensely sample the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are usually uniform, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with the Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to increase the percentage of near-native coarse-grained decoys on a benchmark of proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All-atom decoys derived from EdaFold’s decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement than Rosetta. Our study suggests that improving low-resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software/.
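The idea of replacing uniform probability mass functions with learned ones can be illustrated with a generic Estimation of Distribution Algorithm loop. The sketch below is not EdaFold's actual update rule, which the abstract does not specify: the function names, the elite fraction, the learning rate and the toy energy (number of positions that differ from a hidden target) are all assumptions made for illustration. Each position in the sequence holds a probability mass function over its fragment library; decoys are sampled from these functions, the lowest-energy decoys are kept, and the functions are shifted toward the fragments those decoys used.

```python
import random

def uniform_pmfs(n_positions, n_fragments):
    """One uniform probability mass function per insertion position."""
    return [[1.0 / n_fragments] * n_fragments for _ in range(n_positions)]

def sample_decoy(pmfs, rng):
    """Draw one fragment index per position according to its PMF."""
    return [rng.choices(range(len(p)), weights=p, k=1)[0] for p in pmfs]

def update_pmfs(pmfs, elite, learning_rate=0.5):
    """Blend each PMF with the fragment frequencies seen in the elite decoys.

    This frequency-blending rule is a common generic EDA update,
    assumed here for illustration only.
    """
    n_fragments = len(pmfs[0])
    updated = []
    for pos, old in enumerate(pmfs):
        counts = [0] * n_fragments
        for decoy in elite:
            counts[decoy[pos]] += 1
        freq = [c / len(elite) for c in counts]
        updated.append([(1 - learning_rate) * o + learning_rate * f
                        for o, f in zip(old, freq)])
    return updated

def toy_energy(decoy, target):
    """Hypothetical stand-in for a coarse-grained energy: mismatches to a target."""
    return sum(1 for a, b in zip(decoy, target) if a != b)

def run_eda(n_positions=8, n_fragments=5, n_decoys=200,
            n_iterations=10, elite_fraction=0.2, seed=0):
    rng = random.Random(seed)
    target = [rng.randrange(n_fragments) for _ in range(n_positions)]
    pmfs = uniform_pmfs(n_positions, n_fragments)
    for _ in range(n_iterations):
        decoys = [sample_decoy(pmfs, rng) for _ in range(n_decoys)]
        decoys.sort(key=lambda d: toy_energy(d, target))
        elite = decoys[:max(1, int(elite_fraction * n_decoys))]
        pmfs = update_pmfs(pmfs, elite)
    return pmfs, target
```

Calling `run_eda()` returns the learned probability mass functions together with the hidden target, so the mass assigned to each target fragment can be inspected. In a real setting the toy energy would be replaced by a coarse-grained scoring function, and sampling would proceed by fragment insertion rather than by independent draws.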
Highlights
Fragment-based Protein Structure Prediction (PSP) algorithms have become very successful during the last decade
A protein model is represented only by a sequence of φ, ψ and ω torsion-angle triplets, each associated with one residue of the target sequence
We suggest some leads for further improvements of EdaFold and fragment-based approaches
Summary
Fragment-based Protein Structure Prediction (PSP) algorithms have become very successful during the last decade. This could be due to the increasing number of solved structures available in the Protein Data Bank [1], and the proven efficiency of this approach. The torsion angles determine the backbone of proteins, and replacing one fragment inside a decoy can yield substantial changes to the global conformation. At this point, a protein model is represented only by a sequence of φ, ψ and ω torsion-angle triplets, each associated with one residue of the target sequence. When small perturbations of the main chain are performed, they hardly affect the overall fold of the decoy.
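The torsion-angle representation and the fragment replacement move described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the data structure used by EdaFold or Rosetta: the `insert_fragment` helper is hypothetical, and the angle values are typical textbook values for extended and helical backbones, given in degrees.

```python
from typing import List, Sequence, Tuple

# One (phi, psi, omega) triplet per residue, in degrees.
Torsions = Tuple[float, float, float]

def insert_fragment(model: Sequence[Torsions],
                    fragment: Sequence[Torsions],
                    start: int) -> List[Torsions]:
    """Return a copy of the model with a window of residues replaced.

    The fragment overwrites the triplets of residues
    start .. start + len(fragment) - 1; all other residues are unchanged.
    """
    end = start + len(fragment)
    if start < 0 or end > len(model):
        raise ValueError("fragment does not fit inside the model")
    return list(model[:start]) + list(fragment) + list(model[end:])

# Illustrative values only: a fully extended 10-residue chain
# and a 3-residue helical fragment.
extended = [(-150.0, 150.0, 180.0)] * 10
helix_fragment = [(-57.0, -47.0, 180.0)] * 3

decoy = insert_fragment(extended, helix_fragment, start=4)
```

Only three residues change their torsion angles here, but because every downstream atom position depends on the angles before it, the Cartesian coordinates of the rest of the chain can move considerably once the backbone is rebuilt.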