Annotation efforts in biosciences have focused in past years mainly on the annotation of genomic sequences. Only very limited effort has been put into annotation schemes for pharmaceutical ligands. Here we propose annotation schemes for the ligands of four major target classes, enzymes, G protein-coupled receptors (GPCRs), nuclear receptors (NRs), and ligand-gated ion channels (LGICs), and outline their usage for in silico screening and combinatorial library design. The proposed schemes cover ligand functionality and hierarchical levels of target classification. The classification schemes are based on those established by the EC, GPCRDB, NuclearDB, and LGICDB. The ligands of the MDL Drug Data Report (MDDR) database serve as a reference data set of known pharmacologically active compounds. All ligands were annotated according to the schemes when attribution was possible based on the activity classification provided by the reference database. The purpose of the ligand-target classification schemes is to allow annotation-based searching of the ligand database. In addition, the biological sequence information of the target is directly linkable to the ligand, hereby allowing sequence similarity-based identification of ligands of next homologous receptors. Ligands of specified levels can easily be retrieved to serve as comprehensive reference sets for cheminformatics-based similarity searches and for design of target class focused compound libraries. Retrospective in silico screening experiments within the MDDR01.1 database, searching for structures binding to dopamine D2, all dopamine receptors and all amine-binding class A GPCRs using known dopamine D2 binding compounds as a reference set, have shown that such reference sets are in particular useful for the identification of ligands binding to receptors closely related to the reference system. The potential for ligand identification drops with increasing phylogenetic distance. The analysis of the focus of a tertiary amine based combinatorial library compared to known amine binding class A GPCRs, peptide binding class A GPCRs, and LGIC ligands constitutes a second application scenario which illustrates how the focus of a combinatorial library can be treated quantitatively. The provided annotation schemes, which bridge chem- and bioinformatics by linking ligands to sequences, are expected to be of key utility for further systematic chemogenomics exploration of previously well explored target families.
Read full abstract