Abstract

Drug discovery and repurposing against COVID-19 is a highly relevant topic, with substantial effort dedicated to delivering novel therapeutics targeting SARS-CoV-2. In this context, computer-aided drug discovery is of high interest for orienting early high-throughput screening and optimizing the hit identification rate. We herein propose a pipeline for Ligand-based Drug Discovery (LBDD) against SARS-CoV-2. Through an extensive literature search and multiple filtering steps, we integrated information on 2610 molecules with a validated effect against SARS-CoV and/or SARS-CoV-2. The chemical structures of these molecules were encoded with multiple systems so that they can serve directly as input to conventional machine learning (ML) algorithms or deep learning (DL) architectures. We assessed the performance of seven ML algorithms and four DL algorithms in classifying molecules into two classes: Active and Inactive. Random Forests (RF), the Graph Convolutional Network (GCN) and the Directed Acyclic Graph (DAG) models achieved the best performance. These models were further optimized through hyperparameter tuning and achieved cross-validated ROC-AUC scores of 85%, 83% and 79% for the RF, GCN and DAG models, respectively. An external validation step on the collection of FDA-approved drugs revealed a superior potential of DL algorithms for drug repurposing against SARS-CoV-2 based on the dataset presented herein. Namely, GCN and DAG each achieved a true positive rate above 50%, assessed on the confirmed hits of a PubChem bioassay.
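As an illustration of the two evaluation metrics named above, the following minimal sketch computes ROC-AUC and the true positive rate for a binary Active/Inactive classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration only.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen Active molecule
    receives a higher score than a randomly chosen Inactive one.
    Tied scores count as half a win."""
    active = [s for y, s in zip(labels, scores) if y == 1]
    inactive = [s for y, s in zip(labels, scores) if y == 0]
    if not active or not inactive:
        raise ValueError("need at least one Active and one Inactive molecule")
    wins = 0.0
    for a in active:
        for i in inactive:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(active) * len(inactive))


def true_positive_rate(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> float:
    """Fraction of truly Active molecules whose score reaches the threshold.
    The 0.5 default is an illustrative assumption, not a value from the paper."""
    active_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not active_scores:
        raise ValueError("need at least one Active molecule")
    hits = sum(1 for s in active_scores if s >= threshold)
    return hits / len(active_scores)


# Hypothetical toy data: 1 = Active, 0 = Inactive, with made-up model scores.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.8, 0.3]

print(roc_auc(labels, scores))             # 0.9375
print(true_positive_rate(labels, scores))  # 0.75
```

In a cross-validation setting, ROC-AUC would be computed on each held-out fold and averaged; in an external validation such as the one described, the true positive rate would be computed over the confirmed hits only.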
