Abstract

Despite large-scale efforts to measure drug effects in cancer cell line screens, mapping those effects to patient samples remains a challenge. Biological differences between cell lines and patients, such as the lack of an immune system or microbiome and in vitro survival adaptations, together with biases in measurement technologies, create differences across sample modalities that can confound analysis, including prediction with machine learning. In this work, we propose a multiway batch correction strategy to enable algorithmic prediction of tumor drug response across model systems and patient data. Recent advances in batch correction algorithms have been motivated by the need to correct for batch effects in single-cell omics and include diverse approaches such as variational autoencoders (VAEs) and generative adversarial networks (GANs). Given the success of these generative deep learning methods in single-cell sequencing analysis, we applied similar approaches to correct large omics measurements across various cancer datasets. Here, we describe mapping datasets from diverse data sources and model systems to the same space, so that a predictive model of drug response built in one system, such as cell lines, can be used in biologically relevant models such as organoids, patient-derived xenografts, and tumor data. Specifically, we introduce a modified VAE loss function that uses a cosine similarity distance to minimize the effect of different cancer model systems when predicting cancer types. We evaluate the method on standard data types for drug response prediction: gene expression, copy number variation, and protein abundance. In this method, the cosine similarity is added as an additional term alongside the VAE reconstruction and Kullback-Leibler divergence loss terms.
This injects a measure of the dissimilarity between the tumor and tumor model distributions into backpropagation and gradient descent when the model parameters are updated, resulting in an encoded representation of the data in which the effect of data source is attenuated while the phenotypic signal is preserved. We evaluate how well our approach preserves biological signal while reducing model system-specific noise using logistic regression and Euclidean distance. Our results show that the proposed VAE can effectively correct for platform effects and improve the accuracy of downstream integrative analyses. This study has the potential to improve the accuracy and translatability of proteogenomic drug response studies. The proposed modified VAE could be used to correct for platform effects in a variety of datasets, including those from different studies, different platforms, and different cancer types. This could lead to new insights into cancer biology, calibration of cancer patient digital twins, and the development of new diagnostic and therapeutic strategies.

Citation Format: Brian Karlberg, Raphael Kirchgaessner, Jeremy R. Jacobson, Kyle Ellrott, Sara J. Gosline. Tumor model to tumor treatment: Applying deep learning approaches to map multimodal data from cancer model systems to patients [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 7393.
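
To make the three-term loss described in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not specify how the cosine term is computed, so this sketch assumes it is the cosine distance between the mean latent vectors (centroids) of tumor samples and tumor model samples in a batch. The function names, the `is_tumor` flag, and the weights `beta` and `gamma` are all hypothetical. A real model would compute the same quantities with an automatic differentiation library so that the added term contributes to the gradients.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error for one sample."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_divergence(mu, log_var):
    """KL divergence between a diagonal Gaussian N(mu, exp(log_var)) and N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v, eps=1e-12):
    """1 - cosine similarity; eps guards against division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + eps)


def modified_vae_loss(x, x_hat, mu, log_var, is_tumor, beta=1.0, gamma=1.0):
    """Reconstruction + beta * KL + gamma * cosine alignment term.

    ASSUMPTION: the alignment term compares latent centroids of tumor
    samples and tumor model samples. The abstract only states that a
    cosine similarity term is added to the usual two VAE loss terms.
    """
    n = len(x)
    recon = sum(mse(a, b) for a, b in zip(x, x_hat)) / n
    kl = sum(kl_divergence(m, lv) for m, lv in zip(mu, log_var)) / n

    tumor = [m for m, flag in zip(mu, is_tumor) if flag]
    model = [m for m, flag in zip(mu, is_tumor) if not flag]
    if tumor and model:
        align = cosine_distance(centroid(tumor), centroid(model))
    else:
        # A batch with only one sample source has nothing to align.
        align = 0.0

    return recon + beta * kl + gamma * align


if __name__ == "__main__":
    # Toy batch: one tumor sample and one cell line sample, two features each.
    x = [[1.0, 2.0], [3.0, 4.0]]
    x_hat = [[1.0, 2.0], [3.0, 4.0]]      # perfect reconstruction
    mu = [[1.0, 0.0], [0.0, 1.0]]         # orthogonal latent means
    log_var = [[0.0, 0.0], [0.0, 0.0]]    # unit variance
    print(modified_vae_loss(x, x_hat, mu, log_var, is_tumor=[True, False]))
```

In this sketch, `gamma` controls how strongly the two sample sources are pulled together in latent space. Setting it to zero recovers the usual two-term VAE loss, which gives a natural baseline when checking whether the added term attenuates the data source effect.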