Visualisation of variable binding pockets on protein surfaces by probabilistic analysis of related structure sets

Paul Ashford,Siew K Yeap,Alexander Alex,Irene Nobeli,Mark A Williams,Alice Povia,David S Moss

doi:10.1186/1471-2105-13-39

Abstract

BackgroundProtein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining particular pocket's boundaries. For a set of conformationally distinct structures the challenge is how to make reasonable comparisons between them given that a perfect structural alignment is not possible.ResultsWe have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations and between homologous structures. We demonstrate the benefits of use of multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding.ConclusionsThrough post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualization of the persistence or variability of pockets in sets of related protein structures.

Highlights

Protein structures provide a valuable resource for rational drug design
Existing computational tools use a variety of methods to identify pockets, the simplest are based on local geometry and include PASS [1], LIGSITE [2], Pocket [3], PocketPicker [4], SURFNET [5], CAST [6] and fpocket [7]
Sets of variants can be derived from several sources: simulated ensembles created using Molecular Dynamics (MD) [15,16], Essential Dynamics (ED) [17], Normal Mode Analysis (NMA) [18] or constraint-based methods such as CONCOORD [19] and tCONCOORD [20]; solutionNMR conformational ensembles; multiple structures of the same protein solved in different crystal forms, or with different ligands or experimental conditions

Summary

Introduction

Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. Ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. Proteins in solution are dynamic entities that explore conformational space over time due to side chain motions, local backbone flexibility and larger sub-domain or domain motions [13] It follows that predictions of pockets based on single static structures may fail to detect potential binding sites, or features of such sites, that result from changes in their shape and size over time. It has been shown that the structure-space explored within sets of homologues correlates with that observed with MD simulations [21], homologous superfamilies of proteins provide other potentially useful sets of variant structures

Methods

Results

Conclusion