Abstract

Drug repurposing plays an important role in screening old drugs for new therapeutic efficacy. Existing methods commonly treat the prediction of drug-target interactions as a binary classification problem, which requires a large number of randomly sampled drug-target pairs, accounting for over 50% of the entire training dataset, as negative examples. Because these negative examples do not come from experimental observations, they inevitably decrease the credibility of the predictions. In this study, we propose a multi-label learning framework to find new uses for old drugs and to discover new drugs for known target genes. In the framework, each drug is treated as a class label and its target genes are treated as the class-specific training data used to train a supervised learning model of L2-regularized logistic regression. As such, the inter-drug associations are explicitly modelled in the framework and all the class-specific training data come from experimental observations. In addition, the data requirements are less demanding: for instance, the chemical substructures of a drug are no longer needed, and novel target genes are inferred only from the underlying patterns of the known genes targeted by the drug. Stratified multi-label cross-validation shows that 84.9% of known target genes have at least one drug correctly recognized, and the proposed framework correctly recognizes 86.73% of the independent test drug-target interactions (DTIs) from DrugBank. These results show that the proposed framework can generalize well in the large drug/class space without information on drug chemical structures or target protein structures. Furthermore, we use the trained model to predict new drugs for the known target genes, identify new genes for the old drugs, and infer new associations between old drugs and new disease phenotypes via the OMIM database. Gene ontology (GO) enrichment analyses and the disease associations reported in recent literature provide supporting evidence for the computational results, which potentially shed light on new clinical therapies for new and/or old disease phenotypes.
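The drug-as-class-label idea can be illustrated with a minimal sketch: one L2-regularized logistic regression model is trained per drug, with that drug's known target genes as positives. This is not the authors' implementation. The gene names, the four-element feature vectors, the hyperparameters (`lam`, `lr`, `epochs`), the plain gradient-descent solver, and the choice to use genes targeted only by other drugs as each model's negatives are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, b, x):
    # Probability-like score that the gene with features x belongs to the class.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.
    The bias term b is not regularized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = score(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
        b -= lr * grad_b / n
    return w, b

def train_multilabel(gene_features, drug_targets, **kwargs):
    """One binary model per drug (class label).
    Assumption: genes targeted by other drugs act as negatives."""
    genes = sorted(gene_features)
    X = [gene_features[g] for g in genes]
    models = {}
    for drug, targets in drug_targets.items():
        y = [1.0 if g in targets else 0.0 for g in genes]
        models[drug] = train_l2_logreg(X, y, **kwargs)
    return models

def rank_drugs(models, x):
    # Rank all drugs for one gene feature vector, highest score first.
    scores = {drug: score(w, b, x) for drug, (w, b) in models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: binary gene features and known drug-target sets.
gene_features = {
    "GENE_A": [1, 1, 0, 0],
    "GENE_B": [1, 0, 0, 0],
    "GENE_C": [0, 0, 1, 1],
    "GENE_D": [0, 0, 1, 0],
}
drug_targets = {
    "drug_1": {"GENE_A", "GENE_B"},
    "drug_2": {"GENE_C", "GENE_D"},
}

models = train_multilabel(gene_features, drug_targets)
# Query an unseen gene that shares one feature with a drug_1 target.
ranking = rank_drugs(models, [0, 1, 0, 0])
for drug, s in ranking:
    print(f"{drug}: {s:.3f}")
```

Ranking every drug for a given gene mirrors the two uses named in the abstract: scoring a known target gene against all drug models suggests new drugs for that gene, and scoring a new gene against one drug's model suggests a new target for that drug.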

Highlights

  • Drug repurposing develops new uses for the existing or abandoned drugs to accelerate the process of drug discovery and decrease the development cost

  • Under the therapeutic concept of “one drug multiple targets”, polypharmacology has opened a new avenue to rational development of more effective but less toxic therapeutic agents in recent years [2,3,4]

  • A disease phenotype is often associated with multiple disease genes, urging us to develop a therapeutic policy of drug combination to increase drug efficacy [5,6]; on the other hand, a drug molecule often targets multiple genes [7], implying that an old drug could be reused as a therapy for a new disease, i.e., drug repurposing [8]


Summary

Introduction

Drug repurposing develops new uses for existing or abandoned drugs to accelerate the process of drug discovery and decrease the development cost. Deep learning is well known for its ability to automatically embed feature information into multiple hidden layers of neural network representations. For this reason, deep learning has been used to extract features from target protein sequences for drug-target interaction prediction [27,31,32]. All the other machine learning methods train one global binary model, using all the known interacting pairs as positive training data and randomly sampled drug-target pairs as negative training data. In these methods, drug-target pairs are generally represented with drug and protein structures. Liu et al. [26] integrate chemical structures, chemical expression profiles, side effects of compounds, amino acid sequences, protein–protein interaction networks, and gene ontology (GO) annotations of proteins to screen negative data. Although these binary classification methods have demonstrated their efficacy in predicting drug-target interactions, several major concerns remain to be addressed. We used the model to predict new drugs for all the known target genes, and further associated these new drugs with disease phenotypes via the OMIM database [5] to repurpose these drugs.

Flowchart Overview
Feature Construction
L2-Regularized Logistic Regression
Stratified Multi-Label Cross-Validation and Experimental Setup
Model Evaluation Metrics
Five-Fold Stratified Multi-Label Cross-Validation
Validation against DrugBank and Matador
Comparison with the Existing Methods
Predictions for Drug Repurposing
