Evaluating the Gibbs-Donnan and volume exclusion effects during protein ultrafiltration and diafiltration (UF/DF) is crucial in biopharmaceutical process development for precisely controlling the concentration of the drug substance in the final formulation. Understanding the interactions between formulation excipients and proteins under these conditions requires domain-specific knowledge of molecular-level phenomena. This study developed gradient-boosted tree models to predict the Gibbs-Donnan and volume exclusion effects for amino acids and therapeutic monoclonal antibodies using simple molecular descriptors. The models' predictions were interpreted using information gain and Shapley additive explanations (SHAP) values, both to understand the modes of action of the antibodies and excipients and to validate the models. This interpretation translated feature effects in the machine learning models into real-world molecular interactions, which were cross-referenced with the existing scientific literature for verification. The models were validated in pilot-scale manufacturing runs of two antibody products that require high concentrations. Because they require only a molecule's biophysicochemical descriptors and the process conditions, the proposed models provide an in silico alternative to conventional UF/DF experiments, accelerating process development and improving understanding of the underlying molecular mechanisms through rational interpretation of the models.