Abstract

Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts such as drugs, diseases, and genes. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database that describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. In 10-times repeated 10-fold cross-validation, the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.
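The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `repeated_kfold` are hypothetical, the splits are not stratified by class (a sensible refinement when classes are imbalanced), and no classifier is included. It only shows how 10 repetitions of 10-fold cross-validation yield 100 train/test splits, and how AUC can be computed as the fraction of (Approved, Terminated) pairs that a scorer ranks in the right order.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs: `repeats` shuffles, each cut into k folds."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Fold i takes every k-th shuffled index, starting at position i.
        folds = [order[i::k] for i in range(k)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train_idx, test_idx

# Toy check: 1 = "Approved", 0 = "Terminated"; scores are made-up probabilities.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
splits = list(repeated_kfold(n_samples=50, k=10, repeats=10, seed=1))
print(toy_auc, len(splits))
```

In a full pipeline, a classifier would be fitted on each `train_idx`, scored on the matching `test_idx`, and the resulting 100 AUC values averaged to obtain the reported mean.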

Highlights

  • Drug discovery is a time-consuming and costly process

  • In this study we focused on drug-disease combinations from RepoDB for which a direct path exists between the drug and the disease, or for which the drug and the disease are indirectly connected via one intermediate concept

  • Our work demonstrates that the frequencies of semantic properties of intermediate concepts between drugs and diseases are powerful features to classify drug-disease combinations as “Approved” or “Terminated”
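The feature idea in the highlights above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the triples, the concept names (`drug_A`, `gene_X`, and so on), the semantic type labels, and the helpers `build_neighbors` and `pair_features` are all hypothetical, and each concept is given a single semantic type for simplicity. For one drug-disease pair, the sketch records whether a direct path exists and counts the semantic types of the concepts that connect the two via one intermediate step.

```python
from collections import Counter, defaultdict

# Hypothetical toy knowledge graph as (subject, predicate, object) triples.
TRIPLES = [
    ("drug_A", "inhibits", "gene_X"),
    ("gene_X", "associated_with", "disease_D"),
    ("drug_A", "affects", "pathway_P"),
    ("pathway_P", "involved_in", "disease_D"),
    ("drug_A", "inhibits", "gene_Y"),          # gene_Y does not reach disease_D
    ("gene_Z", "associated_with", "disease_D"),  # gene_Z does not reach drug_A
    ("drug_A", "treats", "disease_D"),         # direct drug-disease path
]

# Hypothetical semantic type for each intermediate concept.
SEMANTIC_TYPE = {
    "gene_X": "Gene",
    "gene_Y": "Gene",
    "gene_Z": "Gene",
    "pathway_P": "Pathway",
}

def build_neighbors(triples):
    """Map each concept to the set of concepts it shares a triple with (undirected)."""
    neighbors = defaultdict(set)
    for subj, _predicate, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
    return neighbors

def pair_features(drug, disease, neighbors, semantic_type):
    """Count semantic types of one-hop intermediates and flag a direct path."""
    drug_nb = neighbors.get(drug, set())
    disease_nb = neighbors.get(disease, set())
    intermediates = (drug_nb & disease_nb) - {drug, disease}
    counts = Counter(semantic_type[c] for c in intermediates)
    counts["direct_path"] = int(disease in drug_nb)
    return dict(counts)

neighbors = build_neighbors(TRIPLES)
features = pair_features("drug_A", "disease_D", neighbors, SEMANTIC_TYPE)
print(features)
```

Feature dictionaries of this kind, one per drug-disease pair, could then be turned into fixed-length vectors and passed to a classifier such as a random forest.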



Introduction

Drug discovery is a time-consuming and costly process. Despite exponential advances in biological and information technologies, the number of new drugs introduced into the clinic has not kept pace [1]. Smalheiser and Swanson were the first to demonstrate that the knowledge published in the biomedical literature could be computationally analyzed to identify and prioritize new drug therapies for diseases [2]. Many knowledge-graph methods have already been developed to identify new drug therapies for diseases [7,8,9,10,11,12,13,14]. Most of these methods are based on similarity between drugs or diseases [7,8,9,10]. A drawback of similarity-based methods is that they cannot tell whether a suggested drug-disease combination reflects a drug-treatment relationship, a drug-side effect relationship, or a “does not treat” type of relationship.

