Abstract

Correctly scoring protein-protein docking models to single out native-like ones is an open challenge, and one that is assessed in CAPRI (Critical Assessment of PRedicted Interactions), the community-wide blind docking experiment. We introduced to the field the first pure consensus method, CONSRANK, which ranks models by their ability to match the most conserved contacts in the ensemble they belong to. In CAPRI, scorers are asked to evaluate a set of available models and select their top ten, based on their own scoring approach. Scorers' performance is ranked by the number of targets/interfaces for which they provide at least one correct solution. In these terms, blind testing in CAPRI Round 30 (a joint prediction round with CASP11) showed that the critical cases for CONSRANK are targets with multiple interfaces or with only a very small number of correct solutions available. To address these challenging cases, CONSRANK has now been modified to include a contact-based clustering of the models as a preliminary step of the scoring process. We used agglomerative hierarchical clustering based on the number of inter-residue contacts shared between models. Two criteria, with different thresholds, were explored for cluster generation: fixing either the number of common contacts or the total number of clusters. For each clustering approach, after selecting the ten most populated clusters, CONSRANK was run on each cluster and its top-ranked model was selected, up to a limit of 10 models per target. We applied the modified scoring approach, Clust-CONSRANK, to SCORE_SET, a set of CAPRI scoring models recently made available by the CAPRI assessors, and to the subset of homodimeric targets in CAPRI Round 30 for which CONSRANK failed to include a correct solution among the ten selected models.
Results show that, for the challenging cases, the clustering step typically enriches the ten top-ranked models in native-like solutions. The best-performing clustering approaches we tested more than double the number of cases for which at least one correct solution is included among the top ten ranked models.
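The cluster-then-rank procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: each model is represented as a set of inter-residue contacts, clusters are merged with average linkage (the abstract does not state the linkage), and the consensus score is a simplified stand-in for CONSRANK, taken here as the mean conservation rate of a model's contacts within its cluster. The function names `cluster_models`, `consensus_rank` and `clust_select` and the parameters `min_common` and `n_clusters` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cluster_models(models, min_common=None, n_clusters=None):
    """Agglomerative clustering on shared-contact counts.

    models: dict mapping a model name to its set of inter-residue contacts.
    Merging stops when the most similar pair of clusters shares, on average,
    fewer than `min_common` contacts, or when `n_clusters` clusters remain.
    Average linkage is an assumption of this sketch.
    """
    clusters = [[name] for name in models]

    def similarity(c1, c2):
        # Mean number of contacts shared by members of the two clusters.
        shared = [len(models[x] & models[y]) for x in c1 for y in c2]
        return sum(shared) / len(shared)

    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        # Find the most similar pair of clusters (i < j always holds).
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        if min_common is not None and similarity(clusters[i], clusters[j]) < min_common:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def consensus_rank(models, members):
    """Rank members by the mean conservation rate of their contacts.

    Simplified stand-in for a consensus score: a contact's conservation
    rate is the fraction of cluster members that contain it.
    """
    counts = Counter(c for m in members for c in models[m])
    n = len(members)

    def score(m):
        contacts = models[m]
        if not contacts:
            return 0.0
        return sum(counts[c] / n for c in contacts) / len(contacts)

    return sorted(members, key=score, reverse=True)


def clust_select(models, max_models=10, **criteria):
    """Pick the top-ranked model from each of the most populated clusters."""
    clusters = cluster_models(models, **criteria)
    clusters.sort(key=len, reverse=True)  # most populated first
    return [consensus_rank(models, c)[0] for c in clusters[:max_models]]
```

Calling `clust_select(models, min_common=2)` applies the common-contacts criterion, while `clust_select(models, n_clusters=20)` fixes the total number of clusters; the threshold values shown here are arbitrary and are not those explored in the study.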

Highlights

  • The thousands of proteins expressed in cells perform most of their functions through interactions with other proteins [1,2].

  • To test the performance of Clust-CONSRANK, we applied it to two sets of models used in previous CAPRI scoring experiments, containing at least one correct solution to be possibly singled out.

  • The second set consists of the 3 dimeric targets in the recent CAPRI Round 30, for which our pure consensus scoring function, CONSRANK, failed to identify any correct solution.

Summary

Introduction

The thousands of proteins expressed in cells perform most of their functions through interactions with other proteins [1,2]. Understanding protein-protein interactions and characterizing them on a structural basis is a crucial step in the investigation of many biological processes [3,4]. Many more protein complex structures could in principle be predicted by computational approaches, specifically by macromolecular docking. However, reliably predicting the three-dimensional structure of protein-protein complexes is still challenging; one of the critical steps is scoring, i.e. the ability to discriminate between correct and incorrect solutions within a pool of models [6,7,8]. The CAPRI (Critical Assessment of PRedicted Interactions) experiment [9,10] organizes blind docking challenges and has been catalyzing the development of computational protein docking for over a decade [11,12]. Success is measured by the number of targets or interfaces for which at least one native-like model (a model of at least acceptable quality) was submitted.
