Abstract

Discovery of a candidate drug molecule through the preclinical stages is an iterative, multiparametric process that continues until a candidate is selected for investigational new drug submission-enabling studies. This iterative approach requires significant time and effort to design, synthesize, and test new compounds. One specific testing step is the evaluation of the half maximal inhibitory concentration (IC50) of compounds in cell line (CL) models. To augment CL testing, computational approaches that predict the IC50 of a drug can be used to guide optimization choices as the molecule moves through hit-to-lead optimization and on to candidate selection. DeepDSC [1] is a deep learning (DL) model developed to predict the IC50 of a given drug in a specific CL by integrating public data from large-scale drug screens and high-throughput RNAseq profiling. The Cancer Cell Line Encyclopedia (CCLE) [2] and Genomics of Drug Sensitivity in Cancer (GDSC) [3] projects compiled pharmacological profiles of 23 and 139 drugs across 491 and 655 cancer CLs, respectively. The DeepDSC authors used 10-fold cross validation (CV) to train the model on each dataset independently and reported the average root mean squared error (RMSE) and coefficient of determination (R2) on the held-out sets across the folds. However, the model's generalizability to novel compounds not in the dataset, specifically structurally dissimilar compounds, has yet to be tested. To this end, we retrained the DeepDSC model, with modifications, on a subset of the GDSC dataset (GDSC2) to assess the generalizability of DL models to BRG399, a proprietary microtubule targeting agent (MTA) developed by BPGbio Inc. GDSC2 was used because the data generated for BRG399 and GDSC2 share the same potency assay (CellTiter-Glo [6]). The GDSC2 set was split into training and held-out sets using an 80-20 random split.
Further, we optimized the learning rate for both the autoencoder and the feed-forward network (FFN), and the L2 regularization factor for the FFN only. Hyperparameters were optimized using 5-fold CV on the training set. Finally, we evaluated the predictive performance of the model on BRG399, which was not present in the training set. We encoded each compound in GDSC2 (n=234) using circular fingerprints [4] and observed an average Jaccard (or Tanimoto) distance (JD) [5] of 0.892 across 27261 drug pairs. BRG399 showed an average JD of 0.873 across 234 drug pairs, and the most similar compound had a JD of 0.785. Our retrained DeepDSC model had an RMSE of 1.15 and R2 of 0.82 on the training set, and an RMSE of 1.19 and R2 of 0.86 on the held-out set. However, on the CLs tested with BRG399 (n=101), the model had an RMSE of 1.84 and R2 < 0, indicating that its predictions for BRG399 were on average worse than those of a mean-predictor model. These results suggest that DL models that generalize to a held-out set for compounds present in the training set may not accurately predict the IC50 of novel compounds absent from it.

Citation Format: Nischal M. Chand, Vivek Vishnudas, Michael A. Kiebish, Anjan Thakurta, Niven R. Narain, Stephane Gesta, Gregory M. Miller. Application of a deep learning based drug sensitivity prediction model on a novel anticancer drug [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3527.
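
To illustrate two quantities reported in the abstract, the following is a minimal Python sketch, not the authors' code. It assumes fingerprints are represented as plain sets of on-bit indices (a simplification; a cheminformatics toolkit would normally generate them), and all function names and toy values are hypothetical. It shows how a Jaccard distance is computed, why 234 compounds give 27261 unordered pairs, and how R2 falls below zero when predictions are worse than always predicting the mean.

```python
from itertools import combinations
from math import comb, sqrt


def jaccard_distance(a, b):
    """Jaccard (Tanimoto) distance between two fingerprints given as sets of on-bit indices."""
    union = a | b
    if not union:
        return 0.0  # assumed convention: two empty fingerprints count as identical
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_distance(fingerprints):
    """Average Jaccard distance over all unordered pairs of fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy fingerprints with made-up bit indices (not real compounds).
toy_fps = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8, 9}]
print(jaccard_distance(toy_fps[0], toy_fps[1]))  # 2 shared bits out of 6 in the union
print(mean_pairwise_distance(toy_fps))

# Number of unordered pairs among 234 compounds.
print(comb(234, 2))  # 27261

# Toy response values (not data from the study).
y_true = [1.0, 2.0, 3.0]
mean_prediction = [2.0, 2.0, 2.0]  # always predict the mean of y_true
bad_prediction = [3.0, 2.0, 1.0]   # worse than the mean predictor
print(r2(y_true, mean_prediction))  # 0.0
print(r2(y_true, bad_prediction))   # -3.0
print(rmse(y_true, bad_prediction))
```

A mean predictor scores R2 = 0 by construction, so any negative R2 means the residual sum of squares exceeds the total variance of the observed values.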