Machine Learning Attempts for Predicting Human Subcutaneous Bioavailability of Monoclonal Antibodies.

Hao Lou, Michael J Hageman

doi:10.1007/s11095-021-03022-y

Abstract

One knowledge gap in subcutaneous (SC) delivery is unpredictable and variable bioavailability. This study aimed to develop machine learning methods to predict whether the bioavailability of a monoclonal antibody (mAb) is ≥70% or <70%, without complete knowledge of the mechanisms and causal relationships between inputs and outputs. A database of mAb SC products was built. Models were trained and validated on this database, and a set of inputs (product properties) was mapped to the output (bioavailability) using different machine learning algorithms. Dimensionality reduction was performed using principal component analysis (PCA). The bioavailability of the mAb products investigated ranged from 35% to 90%. The tree-based methods, including random forest (RF), adaptive boosting (AdaBoost), and decision tree (DT), showed the best predictive performance and generalization power for bioavailability classification. Models based on the multi-layer perceptron (MLP), Gaussian naïve Bayes (GaussianNB), and k-nearest neighbor (kNN) algorithms also provided acceptable prediction accuracy. Machine learning could be a potential tool for predicting mAb bioavailability. Because all input features were obtained from theoretical calculations and predictions rather than experiments, the models may be particularly applicable to early-stage research activities such as mAb molecule triage, design/optimization, mutant screening, molecule selection, and formulation design.
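
The workflow summarized above (binarize bioavailability at 70%, reduce dimensionality with PCA, then compare several classifiers) can be sketched with scikit-learn. This is only an illustrative sketch under stated assumptions: the data are synthetic and are not the study's database, and the feature count, number of principal components, hyperparameters, 5-fold cross-validation, and the helper name `evaluate_models` are all hypothetical choices rather than the authors' settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a product database (ASSUMPTION: not the study's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 12))  # 60 hypothetical products, 12 computed features
bioavailability = np.clip(
    65 + 10 * X[:, 0] - 8 * X[:, 1] + rng.normal(scale=3, size=60), 35, 90
)
y = (bioavailability >= 70).astype(int)  # 1 if bioavailability >= 70%, else 0

# The six algorithm families named in the abstract; hyperparameters are assumed.
models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    "GaussianNB": GaussianNB(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
}


def evaluate_models(X, y, models, n_components=5, n_splits=5):
    """Return mean cross-validated accuracy per model.

    Scaling and PCA sit inside the pipeline, so they are re-fitted on each
    training fold and never see the held-out fold.
    """
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    results = {}
    for name, model in models.items():
        pipe = make_pipeline(StandardScaler(), PCA(n_components=n_components), model)
        scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
        results[name] = float(scores.mean())
    return results


results = evaluate_models(X, y, models)
for name, acc in results.items():
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```

Keeping the scaler and PCA inside the pipeline is a design choice that avoids leaking information from the validation folds into the dimensionality reduction step; with a small dataset, as is typical for marketed mAb products, that leakage could otherwise inflate the apparent accuracy.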

Full Text