X‐ray crystallography is the main experimental method behind the ligand–macromolecule complexes found in the Protein Data Bank (PDB). Applying bioinformatics methods to such structural data can fuel drug discovery, but only if the underlying information is correct. Regrettably, a small number of structures in the PDB are of suboptimal quality due to incorrectly identified and modeled ligands in protein–ligand complexes. In this paper, we combine a graph‐theoretic approach, nuclear density estimates, bioinformatics methods, and prior chemical knowledge to analyze two non‐physiological ligands, HEPES and MES, that are frequent components of crystallization and purification buffers. Our analysis includes quantum mechanics calculations and Cambridge Structural Database (CSD) queries to define the ideal conformation of these ligands, geometry analysis of PDB deposits with respect to several quality factors, and a search for homologous structures to identify other small molecules that could bind in place of the parasitic ligand. Our results highlight the need for careful refinement of macromolecule–ligand complexes and for better validation tools that integrate results from all relevant resources.