Abstract

Traditional techniques for identifying macromolecular targets of drugs use only the information on a query drug and a putative target. However, the mechanisms of action of many drugs depend not only on their binding affinity toward a single protein, but also on signal transduction through cascades of molecular interactions leading to certain phenotypes. Although protein-protein interaction networks and drug-perturbed gene expression profiles can facilitate system-level investigations of drug-target interactions, utilizing such large and heterogeneous data poses notable challenges. To improve the state of the art in drug target identification, we developed GraphDTI, a robust machine learning framework integrating the molecular-level information on drugs, proteins, and binding sites with the system-level information on gene expression and protein-protein interactions. To properly evaluate the performance of GraphDTI, we compiled a high-quality benchmarking dataset and devised a new cluster-based cross-validation protocol. Encouragingly, GraphDTI not only yields an AUC of 0.996 against the validation dataset, but also generalizes well to unseen data with an AUC of 0.939, significantly outperforming other predictors. Finally, selected examples of identified drug-target interactions are validated against the biomedical literature. Numerous applications of GraphDTI include the investigation of drug polypharmacological effects, side effects through off-target binding, and repositioning opportunities.

Highlights

  • Comprehensive knowledge of system-level interactions between small organic molecules and their macromolecular targets is of paramount importance to modern drug discovery

  • The paradigm in drug discovery has shifted from the concept of “one gene, one drug, one disease” to a system-level approach in order to account for the enormous complexity of biological systems involving the information propagation through numerous molecular interactions in a cell and the simultaneous effects of pharmacotherapy on multiple biological processes

  • In GraphDTI, an undirected, weighted subgraph is extracted from the entire human protein-protein interaction (PPI) network; it contains a central node corresponding to the target and multiple connected nodes representing interacting proteins
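
The subgraph extraction described in the last highlight can be illustrated with a minimal sketch. It is not the GraphDTI implementation: the function name `extract_target_subgraph`, the `max_hops` parameter, the dictionary representation of the network, and the toy edge weights are all assumptions made for illustration, and the text above does not state how many interaction partners or hops GraphDTI actually includes.

```python
from collections import deque


def extract_target_subgraph(ppi, target, max_hops=1):
    """Return the undirected, weighted subgraph centered on `target`.

    `ppi` is a hypothetical representation of the network: a dict mapping
    each protein to {neighbor: weight}, stored symmetrically so that every
    edge appears in both directions.
    """
    if target not in ppi:
        raise KeyError(f"{target!r} is not in the network")

    # Breadth-first search outward from the central (target) node.
    hops = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in ppi.get(node, {}):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)

    # Keep only edges whose two endpoints were both selected.
    selected = set(hops)
    return {
        u: {v: w for v, w in ppi.get(u, {}).items() if v in selected}
        for u in selected
    }


# Toy network; the weights are invented and carry no biological meaning.
ppi = {
    "EGFR": {"GRB2": 0.9, "SHC1": 0.8},
    "GRB2": {"EGFR": 0.9, "SOS1": 0.7, "SHC1": 0.6},
    "SHC1": {"EGFR": 0.8, "GRB2": 0.6},
    "SOS1": {"GRB2": 0.7, "KRAS": 0.5},
    "KRAS": {"SOS1": 0.5},
}

sub = extract_target_subgraph(ppi, "EGFR", max_hops=1)
for protein in sorted(sub):
    print(protein, sorted(sub[protein].items()))
```

With `max_hops=1` the result contains the target and its direct partners, together with any weighted edges among those partners; raising `max_hops` widens the neighborhood at the cost of a larger subgraph.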


Summary

Introduction

Comprehensive knowledge of system-level interactions between small organic molecules and their macromolecular targets is of paramount importance to modern drug discovery. Structure-based inverse virtual screening (IVS) techniques employ molecular docking to screen a ligand against a database of proteins in order to find a subset of binding sites that are putative targets for the query molecule [6]. An example of a docking-based method is TarFisDock [7], a webserver that uses the docking program DOCK [8] to dock small molecules against either the Potential Drug-Target Database, which contains 698 protein structures [9], or a custom list of target sites provided by a user. TarFisDock predicted 10 putative targets for 4H-tamoxifen and 12 for vitamin E, many of which are experimentally verified targets. Another docking-based IVS program is idTarget, which employs a divide-and-conquer docking approach combined with quantum chemical charge models and robust regression-based scoring functions [10]. To constrain the search space for a putative binding site for a query ligand, a large docking box, initially covering the entire surface of a target protein, is constructed and then dynamically reduced to smaller grids. idTarget conducts screens against most protein structures present in the Protein Data Bank [11] and has been demonstrated to reproduce known off-targets of drugs and drug-like compounds.
