Abstract

Our previous work with fragment-assembly methods demonstrated specific deficiencies in conformational sampling behaviour. When these are addressed through improved sampling algorithms, tertiary protein structure can be predicted more reliably, provided that good fragments are available and that score values can be relied upon to guide the search to the native basin. In this paper, we present preliminary investigations into two important questions arising from more difficult prediction problems. First, we investigated the extent to which native-like conformational states are generated during multiple runs of our search protocols. We determined that, in cases of difficult prediction, native-like decoys are rarely or never generated. Second, we developed a scheme for decoy retention that balances two objectives: retaining low-scoring structures and retaining conformationally diverse structures sampled during the course of the search. Our method succeeds at retaining more diverse sets of structures and, for a few targets, retains more native-like solutions than our original, energy-based retention scheme. In general, however, we found that the rate at which native-like structural states are generated has a much stronger effect on the eventual distributions of predictive accuracy in the decoy sets than the specific decoy retention strategy used. Our protocols differ in their ability to access native-like states for some targets, and this may explain some of the differences in predictive performance seen between these methods. There appears to be an interaction between fragment sets and move operators, which influences the accessibility of native-like structures for given targets. Our results point to clear directions for further improvements in fragment-based methods, which are likely to enable higher accuracy predictions.

Highlights

  • In fragment-based protein structure prediction, short fragments of structural information are assembled into complete three-dimensional structures, typically using Monte Carlo strategies

  • We evaluate the effects of native-like local minimum (LMin) accessibility and choice of archiving strategy on distributions of predictive accuracy

  • The elitist random (ER) archiving strategy typically retains the most structurally diverse decoy sets, followed by the stochastic ranking-based archiver using contact maps (SRCM) and then the energy-based archivers. This reflects the fact that the ER archiver retains a mostly random selection of LMin structures, which tend to be quite structurally diverse owing to the perturbation and local search steps applied in our search protocols. These results confirm that the SRCM and ER archivers can be used to evaluate the impact of inaccurate score values on decoy retention, as discussed in the Introduction, because both place importance on structural dissimilarity in addition to score values
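
As an illustration of how structural dissimilarity between retained decoys might be quantified with contact maps, the sketch below builds a binary contact map from residue coordinates and compares two maps by Jaccard distance. It is a minimal sketch under stated assumptions, not the measure used in the study: the 8 Å cutoff, the minimum sequence separation of 3, the Jaccard distance, and the function names contact_map, contact_map_distance and mean_pairwise_diversity are all hypothetical choices made for this example.

```python
from itertools import combinations
from math import dist


def contact_map(coords, cutoff=8.0, min_sep=3):
    """Return the set of residue index pairs (i, j) that are in contact.

    coords  : list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    cutoff  : distance threshold in angstroms (assumed value).
    min_sep : minimum sequence separation, so trivial near-neighbour
              contacts along the chain are ignored (assumed value).
    """
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + min_sep, n)
        if dist(coords[i], coords[j]) < cutoff
    }


def contact_map_distance(cm_a, cm_b):
    """Jaccard distance between two contact sets: 0 = identical, 1 = disjoint."""
    union = cm_a | cm_b
    if not union:
        return 0.0  # two empty maps are treated as identical
    return 1.0 - len(cm_a & cm_b) / len(union)


def mean_pairwise_diversity(maps):
    """Average contact-map distance over all pairs of decoys in an archive."""
    pairs = list(combinations(maps, 2))
    if not pairs:
        return 0.0
    return sum(contact_map_distance(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, an archive filled with near-identical low-energy decoys would give a mean pairwise value close to 0, whereas a more varied selection of local minima would give a larger value.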

Summary

Introduction

In fragment-based protein structure prediction, short fragments of structural information are assembled into complete three-dimensional structures, typically using Monte Carlo strategies. Fragment assembly protocols make use of heuristic optimisation procedures, whereby putative structures are selected or rejected by minimising the value of energy or scoring functions [1,2]. In cases where good local minima (LMins) are accessed only rarely, it would be valuable to investigate whether and how relatively high-energy but otherwise promising structures can be retained, without fundamentally altering the behaviour of the search algorithms used. Doing so requires criteria other than score values to decide which structures should be retained during the search.
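
One way to picture retention by criteria other than score alone is a stochastic ranking step, in which adjacent decoys are compared by score with some probability and by a diversity measure otherwise, followed by an elitist step that always keeps the lowest-scoring decoy. The sketch below is a generic adaptation of stochastic bubble-sort ranking and should not be read as the archiver implemented in the study. The function names stochastic_rank and retain, the parameter p_score and its default value, and the assumption that each decoy already has a precomputed diversity value are all hypothetical.

```python
import random


def stochastic_rank(scores, diversity, p_score=0.45, rng=None):
    """Return decoy indices ordered from most to least preferred.

    scores    : lower is better.
    diversity : higher is better (e.g. mean distance to other decoys).
    p_score   : probability that a pairwise comparison uses the score;
                otherwise it uses diversity (assumed parameter).
    """
    rng = rng or random.Random()
    order = list(range(len(scores)))
    n = len(order)
    for _ in range(n):  # at most n bubble-sort sweeps
        swapped = False
        for j in range(n - 1):
            a, b = order[j], order[j + 1]
            if rng.random() < p_score:
                out_of_order = scores[a] > scores[b]
            else:
                out_of_order = diversity[a] < diversity[b]
            if out_of_order:
                order[j], order[j + 1] = b, a
                swapped = True
        if not swapped:
            break
    return order


def retain(scores, diversity, capacity, p_score=0.45, rng=None):
    """Pick `capacity` decoy indices, always including the lowest-scoring one."""
    if capacity <= 0 or not scores:
        return []
    kept = stochastic_rank(scores, diversity, p_score, rng)[:capacity]
    best = min(range(len(scores)), key=scores.__getitem__)
    if best not in kept:
        kept[-1] = best  # elitist step: replace the least preferred survivor
    return kept
```

With p_score set to 1.0 this reduces to purely score-based retention, and with 0.0 to diversity-based retention plus the elitist step; intermediate values trade the two criteria off against each other.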

Alternative Decoy Acceptance Criteria in Protein Structure Prediction
Multiobjectivisation by Scoring Function Decomposition
Diversity-Based Decoy Acceptance Criteria
Outline and Contributions of This Study
Results
Discussion
Materials and Methods
Assessment of LMin Accessibility and Predictive Accuracy
Stochastic Ranking-Based Archiving Strategy
Elitist Step and Choice of Ranking Criteria
Implementation within the Bilevel and ILS Protocols
Experimental Setup
Intra-Archive Diversity Assessment over the Course of Each Run
Number of Native-Like Solutions Retained
Code and Data Availability