Abstract

The residue composition of a ligand binding site determines the interactions available for diffusion-mediated ligand binding, and understanding the general composition of these sites is of great importance if we are to gain insight into the functional diversity of the proteome. Many structure-based drug design methods use such heuristic information to improve the prediction or characterization of ligand-binding sites in proteins of unknown function. The Binding MOAD database is one of the largest curated sets of protein-ligand complexes, and provides a source of diverse, high-quality data for establishing general trends of residue composition from currently available protein structures. We present an analysis of 3,295 non-redundant proteins with 9,114 non-redundant binding sites to identify residues over-represented in binding regions versus the rest of the protein surface. The Binding MOAD database distinguishes biologically relevant “valid” ligands from “invalid” small-molecule ligands bound to the protein. Invalid ligands are present in the crystallization medium and serve no known biological function. Contacts are found to differ between these classes of ligands, indicating that the residue composition of biologically relevant binding sites is distinct not only from the rest of the protein surface, but also from surface regions capable of opportunistic binding of non-functional small molecules. To confirm these trends, we perform a rigorous analysis of the variation of residue propensity with respect to the size of the dataset and the content bias inherent in structure sets obtained from a large protein structure database. We suggest an optimal dataset size for establishing general trends of residue propensities, as well as strategies for assessing the significance of such trends, for future studies of binding-site composition.
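To make the idea of over-representation concrete, the following is a minimal sketch of one common way to express a residue propensity: the frequency of a residue type among binding-site residues divided by its frequency on the rest of the protein surface. This definition, the function name `residue_propensities`, and the toy residue lists are assumptions made for illustration; the abstract does not state the exact formula used in the study.

```python
from collections import Counter


def residue_propensities(site_residues, other_surface_residues):
    """Return {residue: site frequency / frequency on the rest of the surface}.

    Assumed definition for illustration only; values above 1 indicate
    over-representation in binding sites.
    """
    site_counts = Counter(site_residues)
    surface_counts = Counter(other_surface_residues)
    site_total = sum(site_counts.values())
    surface_total = sum(surface_counts.values())
    if site_total == 0 or surface_total == 0:
        raise ValueError("both residue lists must be non-empty")

    propensities = {}
    # Only residues seen on the rest of the surface get a value,
    # which avoids dividing by a zero background frequency.
    for residue, surface_count in surface_counts.items():
        site_fraction = site_counts.get(residue, 0) / site_total
        surface_fraction = surface_count / surface_total
        propensities[residue] = site_fraction / surface_fraction
    return propensities


# Hypothetical toy data: residue names in contact with a ligand,
# and residue names on the remaining protein surface.
site = ["TRP", "TRP", "HIS", "GLY"]
surface = ["TRP", "HIS", "GLY", "GLY", "LYS", "LYS", "LYS", "GLU"]
propensities = residue_propensities(site, surface)
```

In this toy example, a residue type with a ratio above 1 is enriched in binding sites relative to the rest of the surface, and a ratio below 1 indicates depletion.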

Highlights

  • Understanding general properties of protein-ligand binding sites is of great importance to gain insight into the functional diversity of the proteome

  • Who cares if there is a bias when these residues denote small-molecule binding sites? On the contrary, we find that there are residues which show a significant bias between the classes

  • We looked at the differences in propensities between enzyme and non-enzyme, valid-ligand binding sites, which have been previously shown to differ in their ligand efficiencies [25]


Introduction

Understanding general properties of protein-ligand binding sites is of great importance to gain insight into the functional diversity of the proteome. The set of catalytic residues in enzymes is well known and structurally conserved because of the functional role of those residues, and several insightful studies have summarized catalytic residue content in sets of enzymes [1,2,3]. These provided useful heuristics for predicting enzymatic sites, but they gave less detail on non-catalytic interactions. We show how the composition of binding-site surfaces varies with the number of structures analyzed; this measure of statistical significance has not been presented to this extent in other studies to date. Another unique aspect of this study is our examination of the binding of spurious co-crystals, such as crystallization buffers, solvents, and stray ions, which exhibits markedly different trends from the binding of functional ligands.
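One simple way to examine how a propensity estimate depends on the number of structures is repeated random subsampling. The sketch below is an illustration under stated assumptions, not necessarily the procedure used in the study: the propensity definition (site frequency divided by frequency on the rest of the surface), the function names, and the synthetic dataset are all hypothetical.

```python
import random
import statistics


def propensity(proteins, residue):
    """Site frequency of `residue` divided by its frequency on the rest of the surface."""
    site = [r for p in proteins for r in p["site"]]
    rest = [r for p in proteins for r in p["rest"]]
    return (site.count(residue) / len(site)) / (rest.count(residue) / len(rest))


def propensity_spread(proteins, residue, sample_size, n_trials=200, seed=0):
    """Mean and standard deviation of the propensity over random subsets of proteins."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_trials):
        subset = rng.sample(proteins, sample_size)  # sampling without replacement
        values.append(propensity(subset, residue))
    return statistics.mean(values), statistics.stdev(values)


# Synthetic, hypothetical dataset: 60 "proteins", each with 10 binding-site
# residues and 40 other surface residues. Every protein carries at least one
# TRP outside the site so the background frequency is never zero.
data_rng = random.Random(1)
residues = ["TRP", "GLY", "LYS", "GLU"]
proteins = [
    {
        "site": data_rng.choices(residues, weights=[4, 2, 1, 1], k=10),
        "rest": ["TRP"] + data_rng.choices(residues, weights=[1, 2, 3, 3], k=39),
    }
    for _ in range(60)
]

small_mean, small_sd = propensity_spread(proteins, "TRP", sample_size=5)
large_mean, large_sd = propensity_spread(proteins, "TRP", sample_size=50)
```

Comparing `small_sd` with `large_sd` shows how much the estimate fluctuates at each dataset size; a trend that remains stable as the subset grows is more likely to reflect a general property than a sampling artifact.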
