Abstract

Predicting human clearance accurately from in silico-derived parameters alone is highly desirable: it is fast, saves in vitro resources, and spares animals. We derived random forest (RF) models from 1340 compounds with human intravenous pharmacokinetic (PK) data, the largest data set publicly available today. To assess the general applicability of the RF models, we systematically removed structural-therapeutic class analogues and other structurally similar compounds from the training sets. For a quasi-prospective test set of 343 compounds, we show that RF models trained without structurally similar compounds predict human clearance with a geometric mean fold error (GMFE) of 3.3. While the observed GMFE illustrates how difficult it is to generate a useful model that is broadly applicable, we posit that our RF models yield a more realistic assessment of how well human clearance can be predicted prospectively. We used the conformal prediction formalism to assess model applicability and to determine a confidence interval for each prediction. We observed that clearance can be predicted better for renally cleared compounds than for compounds cleared by other mechanisms. We show that applying a classification model for predicting renal clearance identifies a subset of compounds for which clearance can be predicted with higher accuracy, yielding a GMFE of 2.3. In addition, our in silico RF human clearance models compared well to models derived from scaling of human hepatocyte or preclinical in vivo data.
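
For readers unfamiliar with the metric, the sketch below shows how a GMFE is commonly computed, using the usual definition GMFE = 10^(mean |log10(predicted / observed)|). The abstract does not state the exact formula used, so this definition is an assumption, and the function name `gmfe` and the clearance values are hypothetical and purely illustrative.

```python
import math


def gmfe(predicted, observed):
    """Geometric mean fold error: 10 ** mean(|log10(predicted / observed)|).

    Assumes the common definition of GMFE; all values must be positive.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v <= 0 for v in list(predicted) + list(observed)):
        raise ValueError("clearance values must be positive")
    # Absolute log10 fold error per compound, so over- and
    # under-prediction by the same factor count equally.
    abs_log_errors = [
        abs(math.log10(p / o)) for p, o in zip(predicted, observed)
    ]
    return 10 ** (sum(abs_log_errors) / len(abs_log_errors))


# Hypothetical clearance values (e.g. mL/min/kg), not taken from the paper.
predicted = [2.0, 10.0, 0.5]
observed = [1.0, 5.0, 1.0]

# Each prediction is off by a factor of 2, so the GMFE is about 2.
print(round(gmfe(predicted, observed), 3))
```

Under this definition, a GMFE of 3.3 means that predictions deviate from the observed clearance by a factor of about 3.3 on average, in either direction, and a perfect model has a GMFE of 1.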
