Abstract

Machine learning techniques are increasingly used in the analysis of clinical and omics data, primarily because of advances in artificial intelligence (AI) and the accumulation of health-related big data. In this paper, we aim to estimate the likelihood of adverse drug reactions or events (ADRs) in the course of drug discovery using various machine learning methods, and we describe a novel machine learning-based framework for predicting that likelihood. Our framework combines two distinct datasets, drug-induced gene expression profiles from Open TG–GATEs (Toxicogenomics Project–Genomics Assisted Toxicity Evaluation Systems) and ADR occurrence information from the FAERS (FDA [Food and Drug Administration] Adverse Event Reporting System) database, and can be applied to many different ADRs. It incorporates data filtering and cleaning as well as feature selection and hyperparameter fine-tuning. Using this framework with deep neural networks (DNNs), we built a total of 14 predictive models with a mean validation accuracy of 89.4%, indicating that our approach successfully and consistently predicted ADRs for a wide range of drugs. As case studies, we investigated the performance of our prediction models in the context of duodenal ulcer and hepatitis fulminant, highlighting mechanistic insights into those ADRs. The resulting predictive models can help assess the likelihood of ADRs when testing novel pharmaceutical compounds. We believe that our findings offer a promising approach for ADR prediction and will be useful for researchers in drug discovery.

Highlights

  • An adverse drug reaction (ADR) or event is defined as any unintended or undesired effect of a drug (Katzung et al, 2012; Coleman and Pontefract, 2016)

  • One difficulty of using the FAERS database in ADR prediction models is the presence of reports in which multiple drugs were used (Multipharma), which is expected in patients with chronic diseases

  • To reduce the data dispersion caused by multiple dose levels and administration durations in Open TG-GATEs, we filtered out low quality/unsuitable

Summary

INTRODUCTION

An adverse drug reaction (ADR) or event is defined as any unintended or undesired effect of a drug (Katzung et al, 2012; Coleman and Pontefract, 2016). Open TG–GATEs (Igarashi et al, 2014) is a large-scale toxicogenomics database that collects gene expression profiles of in vivo as well as in vitro samples that have been treated with various drugs. These expression profiles are an outcome of the Japanese Toxicogenomics Project (Uehara et al, 2009), which aimed to build an extensive database of drug toxicities for drug discovery. This study describes our approach to generating deep learning-based, systematic ADR prediction models. This approach combines ADR occurrence data, including frequency details, from the FAERS (FDA Adverse Event Reporting System) database with the gene expression profiles from Open TG–GATEs. We show how to improve the models’ performance by applying feature selection and hyperparameter optimization algorithms. The methodologies and models described in our study offer valuable tools for assessing the likelihood of ADRs in the course of drug discovery.
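The combination of ADR occurrence data with expression profiles described above can be illustrated with a small sketch. It is not the authors' pipeline: the toy data, the function names, the 1% reporting threshold, and the variance-based ranking (a simple stand-in for the unspecified feature selection algorithms) are all assumptions made for illustration, and the deep neural network itself is omitted.

```python
from statistics import pvariance

# Hypothetical toy data: drug -> expression values, one value per gene.
expression = {
    "drug_a": [2.1, 0.0, 5.0],
    "drug_b": [1.9, 0.1, 1.0],
    "drug_c": [2.0, 0.0, 4.8],
    "drug_d": [2.2, 0.1, 0.9],  # no report counts below, so it is dropped on join
}

# Hypothetical counts: drug -> (reports mentioning the ADR, all reports).
adr_reports = {
    "drug_a": (30, 1000),
    "drug_b": (1, 1000),
    "drug_c": (45, 900),
}


def label_drugs(reports, threshold=0.01):
    """Label a drug 1 when the ADR report fraction reaches the threshold.

    The threshold is an assumed value, not one taken from the paper.
    """
    return {
        drug: int(hits / total >= threshold)
        for drug, (hits, total) in reports.items()
        if total > 0
    }


def join_profiles(expression, labels):
    """Keep only drugs present in both sources; return aligned drugs, X, y."""
    drugs = sorted(set(expression) & set(labels))
    X = [expression[d] for d in drugs]
    y = [labels[d] for d in drugs]
    return drugs, X, y


def top_variance_features(X, k):
    """Return indices of the k gene columns with the highest variance."""
    variances = [pvariance(column) for column in zip(*X)]
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


labels = label_drugs(adr_reports)
drugs, X, y = join_profiles(expression, labels)
selected = top_variance_features(X, k=2)
# Reduced feature matrix that a classifier could then be trained on.
X_selected = [[row[j] for j in selected] for row in X]
```

In a real setting, the reduced matrix and labels would feed a classifier whose hyperparameters are tuned on held-out data; the sketch stops at the point where the two data sources have been aligned and reduced.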

Overview
Model Building and Training
Evaluation and Enrichment Analysis
Data Processing
Model Evaluation
Case Study 1
Case Study 2
DISCUSSION
DATA AVAILABILITY STATEMENT