Abstract

We report the first membrane protein–protein docking benchmark consisting of 37 targets of diverse functions and folds. The structures were chosen based on a set of parameters such as the availability of unbound structures, the modeling difficulty and their uniqueness. They have been cleaned and consistently numbered to facilitate their use in docking. Using this benchmark, we establish the baseline performance of HADDOCK, without any specific optimization for membrane proteins, for two scenarios: true interface-driven docking and ab initio docking. Despite the fact that HADDOCK has been developed for soluble complexes, it shows promising docking performance for membrane systems, but there is clearly room for further optimization. The resulting set of docking decoys, together with analysis scripts, is made freely available. These can serve as a basis for the optimization of membrane complex-specific scoring functions.

Highlights

  • The docking community makes extensive use of benchmarks for evaluating the performance of docking algorithms and constantly improving them

  • The benchmark is freely available for download from GitHub and includes the reference and unbound structures, renumbered docking-ready structures, and analysis scripts for calculating the RMSD metrics reported in this paper

  • Despite the fact that HADDOCK has not been optimized for membrane proteins, it demonstrates excellent performance in the case where high-quality interface data are available, with a 92% overall success rate when considering all 400 itw models
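
As an illustration of how an overall success rate of this kind can be computed, the sketch below counts the fraction of targets with at least one acceptable model among the top N ranked models. It is a minimal example under stated assumptions, not the analysis script distributed with the benchmark: the function name `success_rate`, the 4.0 Å interface-RMSD cutoff, and the example values are all hypothetical.

```python
from typing import Dict, List


def success_rate(
    irmsd_by_target: Dict[str, List[float]],
    top_n: int,
    cutoff: float = 4.0,  # assumed acceptance threshold in Å, not taken from the paper
) -> float:
    """Return the fraction of targets with at least one model at or below
    `cutoff` among the first `top_n` models.

    Each list must already be ordered by docking score, best-ranked first.
    """
    if not irmsd_by_target:
        raise ValueError("no targets given")
    hits = sum(
        1
        for values in irmsd_by_target.values()
        if any(v <= cutoff for v in values[:top_n])
    )
    return hits / len(irmsd_by_target)


# Hypothetical interface-RMSD values (Å) per target; not data from the paper.
example = {
    "target_a": [6.2, 3.1, 8.0],
    "target_b": [12.5, 9.7, 10.3],
    "target_c": [1.8, 2.2, 5.0],
}

print(success_rate(example, top_n=1))  # only target_c succeeds at rank 1
print(success_rate(example, top_n=3))  # target_a and target_c succeed
```

Considering all generated models for a target, as in the 400-model figure above, corresponds to setting `top_n` to the number of models per target.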

Introduction

The docking community makes extensive use of benchmarks for evaluating the performance of docking algorithms and constantly improving them. The same is true for DOCK/PIERR. Their data set uses the Membrane Proteins of Known Structure (MPSTRUC) database as its primary source of data and contains 22 biological complexes as well as 8 artificial complexes, which were created by cutting GPCR proteins at one of the cytosolic/extracellular loops and separating them into parts. This data set consists mostly of GPCRs and small helical complexes. None of the aforementioned works make the structures they used available, and
