A central aim of biological research is to elucidate the many roles of proteins in complex, dynamic living systems; the selective perturbation of protein function is an important tool in achieving this goal. Because chemical perturbations offer opportunities often not accessible with genetic methods, the development of small-molecule modulators of protein function is at the heart of chemical biology research. In this endeavor, the identification of biologically relevant starting points within the vast chemical space available for the design of compound collections is a particularly important, yet difficult, task. In this Account, we present our research aimed at linking chemical and biological space to define suitable starting points that guide the synthesis of compound collections with biological relevance. Both protein folds and natural product (NP) scaffolds are highly conserved in nature. Whereas different amino acid sequences can make up ligand-binding sites in proteins with highly similar fold types, differently substituted NPs characterized by particular scaffold classes often display diverse biological activities. Therefore, we hypothesized that (i) ligand-binding sites with similar ligand-sensing cores embedded in their folds would bind NPs with similar scaffolds and (ii) selectivity is ensured by variation of both amino acid side chains and NP substituents. To investigate this notion in compound library design, we developed an approach termed biology-oriented synthesis (BIOS). BIOS employs cheminformatic and bioinformatic methods for mapping biologically relevant chemical space and protein space to generate hypotheses for compound collection design and synthesis. BIOS also provides hypotheses for potential bioactivity of compound library members. On the one hand, protein structure similarity clustering (PSSC) is used to identify ligand-binding sites with high subfold similarity, that is, high structural similarity in their ligand-sensing cores.
On the other hand, structural classification by scaffold trees (for example, structural classification of natural products or SCONP), when combined with software tools like "Scaffold Hunter", enables the hierarchical structural classification of small-molecule collections in tree-like arrangements, their annotation with bioactivity data, and the intuitive navigation of chemical space. Brachiation (in a manner analogous to tree-swinging primates) within the scaffold trees serves to identify new starting points for the design and synthesis of small-molecule libraries, and PSSC may be used to select potential protein targets. The introduction of chemical diversity in compound collections designed according to the logic of BIOS is essential for the frequent identification of small molecules with diverse biological activities. The continuing development of synthetic methodology, both on solid phase and in solution, enables the generation of focused small-molecule collections with sufficient substituent, stereochemical, and scaffold diversity to yield comparatively high hit rates in biochemical and biological screens from relatively small libraries. BIOS has also allowed the identification of new ligand classes for several different proteins and chemical probes for the study of protein function in cells.
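The idea of brachiation within a scaffold tree can be illustrated with a minimal sketch. It assumes a toy tree in which each scaffold points to a parent scaffold with one ring fewer, and it treats an unannotated scaffold next to an annotated one on the same branch as a candidate starting point. The scaffold labels, the activity annotations, and the function names `brachiate` and `candidate_starting_points` are hypothetical placeholders for illustration; they are not taken from SCONP or Scaffold Hunter.

```python
from typing import Dict, List, Optional

# Hypothetical toy scaffold tree: each scaffold maps to its parent,
# i.e. a simpler scaffold with one ring fewer. Labels are placeholders.
PARENT: Dict[str, Optional[str]] = {
    "ring_A": None,          # root: single-ring scaffold
    "ring_AB": "ring_A",
    "ring_ABC": "ring_AB",
    "ring_ABD": "ring_AB",
    "ring_ABCE": "ring_ABC",
}

# Made-up bioactivity annotations (None = no annotated activity).
ACTIVITY: Dict[str, Optional[str]] = {
    "ring_A": None,
    "ring_AB": "target_X",
    "ring_ABC": None,
    "ring_ABD": None,
    "ring_ABCE": "target_X",
}


def brachiate(start: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Walk along one branch from a scaffold down to the root."""
    path: List[str] = []
    node: Optional[str] = start
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def candidate_starting_points(
    path: List[str], activity: Dict[str, Optional[str]]
) -> List[str]:
    """Return unannotated scaffolds that sit next to an annotated one on the branch."""
    candidates: List[str] = []
    for i, scaffold in enumerate(path):
        if activity.get(scaffold) is not None:
            continue  # already annotated, not a gap
        neighbours = path[max(i - 1, 0):i] + path[i + 1:i + 2]
        if any(activity.get(n) is not None for n in neighbours):
            candidates.append(scaffold)
    return candidates


path = brachiate("ring_ABCE", PARENT)
candidates = candidate_starting_points(path, ACTIVITY)
print(path)
print(candidates)
```

In practice the tree would be derived from real structures by a cheminformatics toolkit and annotated with measured bioactivity data; the dictionaries here only stand in for that output so that the traversal logic is visible.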