Abstract

A central issue in drug risk-benefit assessment is identifying frequencies of side effects in humans. Currently, frequencies are experimentally determined in randomised controlled clinical trials. We present a machine learning framework for computationally predicting frequencies of drug side effects. Our matrix decomposition algorithm learns latent signatures of drugs and side effects that are both reproducible and biologically interpretable. We show the usefulness of our approach on 759 structurally and therapeutically diverse drugs and 994 side effects from all human physiological systems. Our approach can be applied to any drug for which a small number of side effect frequencies have been identified, in order to predict the frequencies of further, yet unidentified, side effects. We show that our model is informative of the biology underlying drug activity: individual components of the drug signatures are related to the distinct anatomical categories of the drugs and to the specific drug routes of administration.

Highlights

  • A central issue in drug risk-benefit assessment is identifying frequencies of side effects in humans

  • Our analysis of R showed that drug side effects follow a long-tailed distribution (Supplementary Fig. 2), where about 30% of the side effects are responsible for 80% of the associations (Fig. 1a)

  • We found that the distribution of scores predicted for this randomised test set was significantly lower than that of any of the post-marketing test sets (OFFSIDES test set vs randomised set, one-tailed Wilcoxon rank-sum test, P < 1.45 × 10^−177; Side Effect Resource (SIDER) post-marketing set vs randomised set, P < 2.23 × 10^−308)


Summary

Introduction

A central issue in drug risk-benefit assessment is identifying frequencies of side effects in humans. Our matrix decomposition algorithm learns latent signatures of drugs and side effects that are both reproducible and biologically interpretable. Estimating the frequencies of side effects is crucial in drug risk–benefit[1] assessment. These frequencies are estimated using intervention and placebo groups during randomised controlled trials. It is well recognised that numerous side effects are not observed during clinical trials[4] but are only identified after the drug has reached the market[5,6,7]. Our approach for predicting the frequencies of drug side effects is to use a matrix decomposition algorithm that learns a small set of latent features (or signatures) that encode the biological interplay between drugs and side effects. Our predictions are explainable, and the individual features can be interpreted in terms of drug effects on specific human physiological systems. We show that these features are related to different routes of administration and that they capture shared drug clinical activity, drug targets and the anatomy/physiology of side effect phenotypes.
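
To make the idea of learning latent signatures from a partly observed matrix concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that R is a drug-by-side-effect matrix of non-negative scores, that a 0/1 mask marks which entries are known, and that a generic masked non-negative factorisation with multiplicative updates is an acceptable stand-in. The function name masked_nmf, the toy values and all parameters are hypothetical.

```python
import random


def masked_nmf(R, mask, k, iters=200, seed=0, eps=1e-9):
    """Approximate R by W * H using only entries where mask is 1.

    Generic masked NMF with multiplicative updates (an assumption for
    illustration; the source does not specify its update rules here).
    W holds one k-dimensional signature per drug (row),
    H holds one k-dimensional signature per side effect (column).
    """
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]

    def reconstruct():
        return [[sum(W[i][f] * H[f][j] for f in range(k)) for j in range(m)]
                for i in range(n)]

    for _ in range(iters):
        # Update drug signatures, holding side-effect signatures fixed.
        P = reconstruct()
        for i in range(n):
            for f in range(k):
                num = sum(mask[i][j] * R[i][j] * H[f][j] for j in range(m))
                den = sum(mask[i][j] * P[i][j] * H[f][j] for j in range(m))
                W[i][f] *= num / (den + eps)
        # Update side-effect signatures, holding drug signatures fixed.
        P = reconstruct()
        for f in range(k):
            for j in range(m):
                num = sum(mask[i][j] * R[i][j] * W[i][f] for i in range(n))
                den = sum(mask[i][j] * P[i][j] * W[i][f] for i in range(n))
                H[f][j] *= num / (den + eps)
    return W, H


# Toy rank-1 matrix (hypothetical values); the last entry is hidden from the fit.
R = [[1.0, 2.0, 4.0],
     [2.0, 4.0, 8.0],
     [3.0, 6.0, 0.0]]
mask = [[1, 1, 1],
        [1, 1, 1],
        [1, 1, 0]]

W, H = masked_nmf(R, mask, k=1)
held_out = sum(W[2][f] * H[f][2] for f in range(1))
print(round(held_out, 3))
```

The toy matrix is built as an outer product of (1, 2, 3) and (1, 2, 4), so the hidden entry would be 12 if it followed the same pattern; the printed value is the model's estimate for it. Real data would be far sparser and noisier, and the number of latent features k would have to be chosen by validation.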

