Drug response prediction in patient-derived xenografts with data augmentation and multimodal deep learning.

Abstract

e13572 Background: Prediction of drug response is a critical research area in precision oncology and has previously been explored with large drug screening studies of cancer cell lines (CCLs). Patient-derived xenografts (PDXs) are an appealing platform for preclinical drug studies because their in vivo environment helps preserve tumor heterogeneity and usually mimics the drug response of patients with cancer better than CCLs do.
Methods: We investigate a multimodal neural network (NN) and data augmentation for drug response prediction in PDXs. The multimodal NN learns to predict response from drug descriptors, gene expression (GE), and histology whole-slide images (WSIs); the multimodality refers to the tumor features only. The NN uses late integration, in which separate subnetworks encode each input feature type before the concatenation and prediction layers. Median tumor volume per treatment group is assessed relative to the control group to create a binary response variable. The data include 12 single-drug and 36 drug-pair treatments, resulting in 2,556 single-drug and 2,203 drug-pair response values. Pathology and omics data from 487 PDXs from NCI's Patient Derived Models Repository are used as tumor feature inputs to the model. We explore whether integrating WSIs with GE improves predictions compared with models that use GE alone. We use two methods to address the limited number of response values in the dataset: 1) homogenizing drug representations, which allows us to combine single-drug and drug-pair treatments into a single dataset, and 2) augmenting drug-pair samples by switching the order of the drug features, which doubles the number of drug-pair samples. Together, these methods yield 6,962 responses (2,556 + 2 × 2,203) and allow us to train multimodal and unimodal NNs without changing architectures or the dataset.
Results: The prediction performance of three unimodal NNs that use GE (um1, um2, and um3) is compared to assess the contribution of the data augmentation methods. NN um1, which uses the full dataset (the original and augmented drug-pair treatments as well as the single-drug treatments), significantly outperforms (p-values < 0.01) the NNs that ignore either the augmented drug pairs (um2) or the single-drug treatments (um3). In assessing the contribution of multimodal learning, the multimodal NN (mm) outperforms both unimodal NNs, which ignore either GE (um4) or the WSIs (um1). However, the improvement of mm over um1 is not statistically significant (p-value < 0.26).
Conclusions: Our results show that data augmentation and the integration of histology images with GE can help improve the prediction of drug response in PDXs. [Table: see text]
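The two augmentation steps in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code. The `Sample` record, the two-slot drug layout, and the choice to copy a single drug's descriptors into both slots are assumptions made for the sketch; the abstract states only that drug representations are homogenized and that the order of drug features is switched for pairs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Sample:
    """Hypothetical record: two drug-descriptor slots plus a binary response."""
    drug_a: Tuple[float, ...]
    drug_b: Tuple[float, ...]
    response: int


def homogenize_single(drug: Tuple[float, ...], response: int) -> Sample:
    # Assumption: a single-drug treatment is put into the pair layout by
    # copying its descriptors into both drug slots.
    return Sample(drug, drug, response)


def augment_pairs(pairs: List[Sample]) -> List[Sample]:
    # Keep every original pair and add a copy with the drug order swapped.
    swapped = [Sample(s.drug_b, s.drug_a, s.response) for s in pairs]
    return pairs + swapped


def combined_size(n_single: int, n_pair: int) -> int:
    # Single-drug samples are kept once; drug-pair samples are doubled.
    return n_single + 2 * n_pair


# Toy example with made-up descriptor values.
singles = [homogenize_single((0.1, 0.2), 1)]
pairs = [Sample((0.1, 0.2), (0.3, 0.4), 0)]
dataset = singles + augment_pairs(pairs)
```

With the counts reported in the abstract, `combined_size(2556, 2203)` evaluates to 2,556 + 2 × 2,203 = 6,962, which matches the stated total number of responses.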

Similar Papers
  • Research Article
  • Cite count: 10
  • 10.3389/fmed.2023.1058919
Data augmentation and multimodal learning for predicting drug response in patient-derived xenografts from gene expressions and histology images
  • Mar 7, 2023
  • Frontiers in Medicine
  • Alexander Partin + 9 more

Patient-derived xenografts (PDXs) are an appealing platform for preclinical drug studies. A primary challenge in modeling drug response prediction (DRP) with PDXs and neural networks (NNs) is the limited number of drug response samples. We investigate a multimodal neural network (MM-Net) and data augmentation for DRP in PDXs. The MM-Net learns to predict response using drug descriptors, gene expressions (GE), and histology whole-slide images (WSIs). We explore whether combining WSIs with GE improves predictions as compared with models that use GE alone. We propose two data augmentation methods that allow us to train multimodal and unimodal NNs on a single larger dataset without changing architectures: 1) combining single-drug and drug-pair treatments by homogenizing drug representations, and 2) augmenting drug pairs, which doubles the number of drug-pair samples. Unimodal NNs that use GE are compared to assess the contribution of data augmentation. The NN that uses the original and the augmented drug-pair treatments as well as single-drug treatments outperforms NNs that ignore either the augmented drug pairs or the single-drug treatments. In assessing multimodal learning based on the Matthews correlation coefficient (MCC) metric, MM-Net outperforms all the baselines. Our results show that data augmentation and integration of histology images with GE can improve prediction performance of drug response in PDXs.
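For readers unfamiliar with the MCC metric mentioned above, a minimal sketch of the standard Matthews correlation coefficient for binary labels follows. The function name and the convention of returning 0.0 when the denominator is zero are choices made here for illustration; the abstract does not describe how the authors computed the metric.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention assumed here: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 (every prediction wrong) to +1 (every prediction right) and uses all four confusion-matrix cells, which is why it is often preferred over accuracy when responders and non-responders are imbalanced.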

  • Research Article
  • Cite count: 3
  • 10.1360/tb-2020-0557
The application of artificial intelligence to drug sensitivity prediction
  • Jun 17, 2020
  • Chinese Science Bulletin
  • Xutong Li + 9 more

The development of computational methods for the prediction of effective therapeutic strategies based on the genomic information of patients is the main challenge of precision medicine. Since the 21st century, next-generation sequencing (NGS) has opened up new possibilities for personalized medicine. Extensive characterization at the molecular level for hundreds of cancer cell lines has been brought to the public eye by many organizations and agencies around the world. For example, the National Cancer Institute 60 Human Cancer Cell Line Screen (NCI-60), Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) have provided large-scale omics data such as genomic, transcriptomic and epigenomic data characterizing cancer cell lines, and The Cancer Genome Atlas (TCGA) has molecularly characterized over 20000 primary cancers of patients. Combined with the drug response data of cancer cell lines, multiomics data could be used to analyse the mechanisms of action of anticancer drugs, which could be incorporated into precision medicine strategies. Over several decades, artificial intelligence (AI) technologies based on big data have revolutionized bioinformatics. AI has built a bridge between genomics and drug sensitivity by promoting the development of predictive models for the drug response of cancer cell lines. The 2012 NCI-DREAM drug prediction challenge has been particularly influential, as the innovative applications of machine learning that emerged from it have laid the groundwork for future studies. However, classic machine learning models are still challenging in terms of predictability because they limit the systematic integration of high-dimensional multiomics data. Therefore, network-based approaches, including link prediction and network representation, have become mainstream methods for drug response prediction. 
On the one hand, network-based approaches avoid the “small n, large p” problem because the multiomics features are either represented in a gene/protein network or embedded in similarity networks between cell lines. On the other hand, introducing gene regulatory networks (GRNs) and protein-protein interactions (PPIs) into the predictive model can provide a functional background for the integration of genomic data and thereby improve the predictive performance for drug response. In addition to network-based approaches, multimodal deep learning models can systematically integrate multiomic data by treating them as different modalities. Generally, there are three feature fusion methods in deep neural networks: input-level feature fusion (early fusion), intermediate feature fusion, and decision-level fusion (late fusion). Intermediate feature fusion is predominant in drug response prediction studies: features are learned separately for each type of omics data and then integrated into one unified representation that serves as the input to a classifier or a regressor. Moreover, features of drug structures can be added to the model to improve performance. In brief, we summarize the characteristics of publicly accessible genomic databases and discuss the trends of artificial intelligence applications in drug sensitivity prediction for cancer cell lines, including machine learning, networks, and multimodal deep neural networks.
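The three fusion levels named above differ only in where the modalities are combined. The sketch below uses plain Python lists and toy functions in place of real neural network layers. All names, the stand-in "encoders", and the use of averaging for decision-level fusion are assumptions for illustration only.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def concat(vectors: Sequence[Vector]) -> Vector:
    """Concatenate several feature vectors into one."""
    return [x for vec in vectors for x in vec]


def mean(vec: Vector) -> float:
    return sum(vec) / len(vec)


def early_fusion(modalities: Sequence[Vector],
                 model: Callable[[Vector], float]) -> float:
    # Input-level fusion: raw features are concatenated before any model sees them.
    return model(concat(modalities))


def intermediate_fusion(modalities: Sequence[Vector],
                        encoders: Sequence[Callable[[Vector], Vector]],
                        head: Callable[[Vector], float]) -> float:
    # Each modality is encoded separately; the encodings are then concatenated
    # into one representation that feeds a shared prediction head.
    encoded = [enc(x) for enc, x in zip(encoders, modalities)]
    return head(concat(encoded))


def late_fusion(modalities: Sequence[Vector],
                predictors: Sequence[Callable[[Vector], float]]) -> float:
    # Decision-level fusion: each modality gets its own prediction, and the
    # predictions are combined (here, by averaging, as an assumption).
    scores = [pred(x) for pred, x in zip(predictors, modalities)]
    return mean(scores)


# Toy stand-ins for two modalities, e.g. gene expression and image features.
ge = [1.0, 3.0]
wsi = [2.0, 4.0, 6.0]
early = early_fusion([ge, wsi], mean)
inter = intermediate_fusion([ge, wsi],
                            [lambda x: [mean(x)], lambda x: [max(x)]],
                            mean)
late = late_fusion([ge, wsi], [mean, mean])
```

In a real model, the encoders and heads would be trainable subnetworks; the point of the sketch is only the position of the concatenation or averaging step.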

  • Research Article
  • Cite count: 13
  • 10.3389/fbinf.2023.1164482
XMR: an explainable multimodal neural network for drug response prediction.
  • Aug 2, 2023
  • Frontiers in Bioinformatics
  • Zihao Wang + 4 more

Introduction: Existing large-scale preclinical cancer drug response databases provide us with a great opportunity to identify and predict potentially effective drugs to combat cancers. Deep learning models built on these databases have been developed and applied to tackle the cancer drug-response prediction task. Their prediction has been demonstrated to significantly outperform traditional machine learning methods. However, due to the "black box" characteristic, biologically faithful explanations are hardly derived from these deep learning models. Interpretable deep learning models that rely on visible neural networks (VNNs) have been proposed to provide biological justification for the predicted outcomes. However, their performance does not meet the expectation to be applied in clinical practice. Methods: In this paper, we develop an XMR model, an eXplainable Multimodal neural network for drug Response prediction. XMR is a new compact multimodal neural network consisting of two sub-networks: a visible neural network for learning genomic features and a graph neural network (GNN) for learning drugs' structural features. Both sub-networks are integrated into a multimodal fusion layer to model the drug response for the given gene mutations and the drug's molecular structures. Furthermore, a pruning approach is applied to provide better interpretations of the XMR model. We use five pathway hierarchies (cell cycle, DNA repair, diseases, signal transduction, and metabolism), which are obtained from the Reactome Pathway Database, as the architecture of VNN for our XMR model to predict drug responses of triple negative breast cancer. Results: We find that our model outperforms other state-of-the-art interpretable deep learning models in terms of predictive performance. In addition, our model can provide biological insights into explaining drug responses for triple-negative breast cancer. 
Discussion: Overall, combining both VNN and GNN in a multimodal fusion layer, XMR captures key genomic and molecular features and offers reasonable interpretability in biology, thereby better predicting drug responses in cancer patients. Our model would also benefit personalized cancer therapy in the future.

  • Research Article
  • Cite count: 27
  • 10.1186/s12859-021-04146-z
Super.FELT: supervised feature extraction learning using triplet loss for drug response prediction with multi-omics data
  • May 25, 2021
  • BMC Bioinformatics
  • Sejin Park + 2 more

Background: Predicting the drug response of a patient is important for precision oncology. In recent studies, multi-omics data have been used to improve the prediction accuracy of drug response. Although multi-omics data are good resources for drug response prediction, the large dimension of data tends to hinder performance improvement. In this study, we aimed to develop a new method, which can effectively reduce the large dimension of data, based on the supervised deep learning model for predicting drug response. Results: We proposed a novel method called Supervised Feature Extraction Learning using Triplet loss (Super.FELT) for drug response prediction. Super.FELT consists of three stages, namely, feature selection, feature encoding using a supervised method, and binary classification of drug response (sensitive or resistant). We used multi-omics data including mutation, copy number aberration, and gene expression, and these were obtained from cell lines [Genomics of Drug Sensitivity in Cancer (GDSC), Cancer Cell Line Encyclopedia (CCLE), and Cancer Therapeutics Response Portal (CTRP)], patient-derived tumor xenografts (PDX), and The Cancer Genome Atlas (TCGA). GDSC was used for training and cross-validation tests, and CCLE, CTRP, PDX, and TCGA were used for external validation. We performed ablation studies for the three stages and verified that the use of multi-omics data guarantees better performance of drug response prediction. Our results verified that Super.FELT outperformed the other methods at external validation on PDX and TCGA and was good at cross-validation on GDSC and external validation on CCLE and CTRP. In addition, through our experiments, we confirmed that using multi-omics data is useful for external non-cell line data. Conclusion: By separating the three stages, Super.FELT achieved better performance than the other methods. 
Through our results, we found that it is important to train encoders and a classifier independently, especially for external test on PDX and TCGA. Moreover, although gene expression is the most powerful data on cell line data, multi-omics promises better performance for external validation on non-cell line data than gene expression data. Source codes of Super.FELT are available at https://github.com/DMCB-GIST/Super.FELT.
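The triplet loss that gives Super.FELT its name can be illustrated with the standard margin-based formulation. This is a generic sketch, not code from the linked repository. The Euclidean distance, the margin of 1.0, and the function name are assumptions chosen for the example.

```python
import math
from typing import Sequence


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 1.0) -> float:
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The anchor and positive share a class label (e.g. both sensitive);
    the negative comes from the other class (e.g. resistant).
    """
    d_ap = math.dist(anchor, positive)  # distance to the same-class sample
    d_an = math.dist(anchor, negative)  # distance to the other-class sample
    return max(0.0, d_ap - d_an + margin)
```

The loss is zero once the other-class sample is farther from the anchor than the same-class sample by at least the margin, so training an encoder with it pulls same-label samples together in the encoded space and pushes different-label samples apart.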

  • Research Article
  • 10.1158/1538-7445.am2020-3913
Abstract 3913: Evaluation of patient-derived cell lines and cancer organoids for the prediction of drug responses in patient-derived xenograft models
  • Aug 13, 2020
  • Cancer Research
  • Petreena Campbell + 32 more

Cancer organoids are heterogeneous 3D cellular clusters with complexities that mimic some characteristics of tumors in situ. Thus, assays performed with cancer organoids might enable better predictions of in vivo drug responses than those performed with cell monolayers. The National Cancer Institute (NCI) is developing a national repository of Patient-Derived (PD) models composed of clinically annotated and molecularly characterized PD xenografts (PDXs), PD tumor cell lines (PDCs), and PD cancer organoids (PDOrgs) (https://pdmr.cancer.gov/). We evaluated the therapeutic activity of a panel of FDA-approved and investigational anticancer agents, including carboplatin, gemcitabine, paclitaxel, SN-38, 5-FU, adavosertib, erlotinib, trametinib, and vemurafenib, against a cohort of PDCs, PDOrgs, and PDXs from solid tumors including colon, gastroesophageal, head and neck, NSCLC, pancreatic, bladder, and uterine cancers. Our goal was to investigate whether drug sensitivities determined using PDCs and PDOrgs correlate with responses observed in the matching PDXs. Cultures were exposed to anticancer agents at concentrations ranging from 1 pM to 100 µM for periods of 4 or 6 days. The data indicated that the GI50 values for PDOrgs were in overall agreement with in vivo PDX drug responses measured as relative median to event free survival (RMEFS), where the event is tumor volume quadrupling and RMEFS is calculated as the median time (days) from treatment initiation to tumor volume quadrupling for treated animals divided by the corresponding median time for control animals. For both paclitaxel and trametinib, responses in PDOrgs, from most sensitive to most resistant, were similar to those in the corresponding PDXs. Drug sensitivities determined in PDC monolayers were less clearly related to in vivo PDX responses, particularly for PDCs treated with carboplatin, gemcitabine, and SN-38. 
This work is part of a larger effort to provide a rigorous comparison between fully characterized and annotated PDCs-PDOrgs-PDXs to assess the value of different in vitro model systems for the prediction of PDX drug responses. This research was supported [in part] by the Developmental Therapeutics Program in the Division of Cancer Treatment and Diagnosis of the National Cancer Institute. Funded by NCI Contract No. HHSN261200800001E. Citation Format: Petreena Campbell, Curtis Hose, Lara El Touny, Erik Harris, John Connelly, Carrie Bonomi, Kelly Dougherty, Savanna Styers, Abigail Walke, Jenna Moyer, Mariah Baldwin, Anna Wade, Michael Mullendore, Kaitlyn Arthur, Matthew Murphy, Kevin Plater, Marion Gibson, Joseph Geraghty, Michelle Gottholm-Ahalt, Tara Grinnage-Pulley, Tiffanie Chase, John Carter, Howard Stotler, Debbie Trail, Luke Stockwin, Dianne Newton, Yvonne Evrard, Melinda Hollingshead, Ralph E. Parchment, Nathan P. Coussens, Beverly A. Teicher, James H. Doroshow, Annamaria Rapisarda. Evaluation of patient-derived cell lines and cancer organoids for the prediction of drug responses in patient-derived xenograft models [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 3913.
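The RMEFS ratio described in the abstract above is a quotient of two medians and can be written out as a short calculation. The function name and the example day counts below are made up for illustration and are not data from the study.

```python
from statistics import median
from typing import Sequence


def rmefs(treated_days: Sequence[float], control_days: Sequence[float]) -> float:
    """Median time to tumor volume quadrupling in treated animals divided by
    the same median in control animals (times in days from treatment start)."""
    return median(treated_days) / median(control_days)


# Hypothetical times (days) to tumor volume quadrupling for each animal.
example = rmefs([20, 30, 40], [10, 12, 14])
```

A value above 1 means tumors in treated animals took longer to quadruple in volume than tumors in controls.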

  • Research Article
  • 10.1158/1538-7445.sabcs19-p6-03-17
Abstract P6-03-17: Effect of histone deacetylase inhibitors on patient-derived neoadjuvant chemotherapy resistant triple negative breast cancer xenografts that represent understudied patients
  • Feb 14, 2020
  • Cancer Research
  • Margarite Matossian + 23 more

Triple negative breast cancers (TNBCs) are a clinically and biologically aggressive breast cancer (BC) subtype; TNBC tumors have higher rates of metastasis, relapse and acquired/inherent drug resistance. Incidence and mortality rates of TNBC are stratified based on patient ethnicity: patients with African ancestry have higher mortality rates and more diagnoses of invasive cancers than patients of other ethnicities. Louisiana has a high proportion of African-American residents (32.7% in 2018), and New Orleans has among the highest incidences of TNBC in the country. Many of our patients present with TNBC tumors that are partially or completely resistant to neoadjuvant chemotherapies. There are currently no clinically approved targeted therapies for TNBC. Current TNBC research focused on therapeutic discovery does not adequately address the knowledge gap regarding ethnic disparity in TNBC incidence/mortality rates and TNBC biology. To date, most TNBC-related research and knowledge has been acquired from Caucasian patients, although patients with African and Hispanic ancestries represent the majority of TNBC cases. Patient-derived xenografts (PDXs) are extensively used in BC research, as they mimic the complex microanatomy, oncoarchitecture, and cell-cell/cell-stroma interactions of tumors. Here, we demonstrated that the unique composition of PDX tumors is not dramatically affected by serial transplantation in mice, based on molecular phenotypes (examined using qRT-PCR and RNA sequencing) and the oncoarchitecture of the extracellular matrix (based on cryogenic scanning electron microscopy). Using these models in basic research facilitates translation of laboratory findings to the clinical setting and dramatically enhances drug discovery research. We have established over twelve TNBC PDX models, 90% of which represent patients of African ancestry, and most of which are resistant to neoadjuvant regimens. 
We focus on dissecting and evaluating kinase inhibitor/targeted drug response to various individual components (tumor cell biology, stroma, immune, extracellular matrix) of chemotherapy resistant TNBC tumors. Histone deacetylase inhibitors (DACi) are promising therapeutic agents in TNBC systems; they have been shown to suppress tumorigenesis and metastasis in TNBC through suppression of the mesenchymal phenotype in cell line-based studies. In this study we utilized various TNBC PDX models (TU-BcX-2K1, -2O0, -4IC, -4M4, -4QAN, -4QX) to assess these findings in more translational systems. Interestingly, we showed that the effect of DACi on tumorigenesis and metastasis varied depending on the specific TNBC PDXs utilized. These data suggest that specific genes/signaling pathways exist in individual patient tumors that can predict tumor responsiveness to DACi. Preliminary data using the NCI oncology drug set implicated the MEK1/2 pathway in the sensitization of TNBC cells. Furthermore, we found a disconnect in the expression of genes previously shown to be affected by DACi therapy (CDH1, VIM, ZEB1, ZEB2) across various derivations of PDX models (cells, PDX-Os, ex vivo, in vivo). These findings demonstrate that testing various derivations of PDX models is crucial to parsing out specific mechanisms of targeted therapies. The methods presented here to assess targeted drug response and drug resistance using PDX models can be applied to any area of cancer research and are not limited to breast cancer. Citation Format: Margarite Matossian, Steven Elliott, Maryl Wright, Tiffany Chang, Madlin Alzoubi, Henri Wathieu, Rachel Sabol, Alex Alfortish, Hope Burks, Van Hoang, Deniz Ucar, Gabrielle Windsor, Thomas Yan, Jovanny Zabaleta, Fokhrul Hossain, Bruce Bunnell, Krzysztof Moroz, Arnold Zea, Adam Riker, Steven Jones, Elizabeth Martin, Lucio Miele, Bridgette Collins-Burow, Matthew Burow. 
Effect of histone deacetylase inhibitors on patient-derived neoadjuvant chemotherapy resistant triple negative breast cancer xenografts that represent understudied patients [abstract]. In: Proceedings of the 2019 San Antonio Breast Cancer Symposium; 2019 Dec 10-14; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2020;80(4 Suppl):Abstract nr P6-03-17.

  • Research Article
  • Cite count: 41
  • 10.1186/s12859-022-04964-9
Deep learning and multi-omics approach to predict drug responses in cancer
  • Nov 28, 2022
  • BMC Bioinformatics
  • Conghao Wang + 4 more

Background: Cancers are genetically heterogeneous, so anticancer drugs show varying degrees of effectiveness on patients due to their differing genetic profiles. Knowing patients' responses to numerous cancer drugs is necessary for personalized cancer treatment. Using molecular profiles of cancer cell lines available from the Cancer Cell Line Encyclopedia (CCLE) and anticancer drug responses available in the Genomics of Drug Sensitivity in Cancer (GDSC), we build computational models to predict anticancer drug responses from molecular features. Results: We propose a novel deep neural network model that integrates multi-omics data available as gene expressions, copy number variations, gene mutations, reverse phase protein array expressions, and metabolomics expressions, in order to predict cellular responses to known anti-cancer drugs. We employ a novel graph embedding layer that incorporates interactome data as prior information for prediction. Moreover, we propose a novel attention layer that effectively combines different omics features, taking their interactions into account. The network outperformed feedforward neural networks and achieved R^2 values of 0.90 for prediction of drug responses from cancer cell line data available in CCLE and GDSC. Conclusion: The outstanding results of our experiments demonstrate that the proposed method is capable of capturing the interactions of genes and proteins, and integrating multi-omics features effectively. Furthermore, both the results of ablation studies and the investigations of the attention layer imply that gene mutation has a greater influence on the prediction of drug responses than other omics data types. Therefore, we conclude that our approach can not only predict the anti-cancer drug response precisely but also provide insights into reaction mechanisms of cancer cell lines and drugs.

  • Research Article
  • 10.1158/1538-7445.am2018-2175
Abstract 2175: Considerations in PDX mouse trial design and their relevance to human clinical trial outcomes
  • Jul 1, 2018
  • Cancer Research
  • Jingjing Jiang + 7 more

Using cancer models to validate drug targets, evaluate drug candidates, and support clinical trial design has been an important part of preclinical studies in cancer drug research. To translate cancer model studies into clinical studies, great efforts have been made to generate a large number of patient derived xenograft (PDX) tumor models in certain cancer types and to demonstrate their similarities to cancer patients in tumor growth, histopathology, tumor complexity, molecular features and drug responses. Recently, the focus has shifted to using cancer model populations to mimic clinical trial design and predict drug responses in clinical trials. We have developed over 1200 PDX models in multiple cancer types from naive or relapsed tumor samples. Genomic profile and hotspot mutation analyses were performed to characterize drug targets and biomarkers used in clinical settings. Chemotherapies such as taxane and platinum, and targeted drugs such as cabozantinib, olaparib or sorafenib, were tested at different doses and durations in PDX models of cancers such as lung, gastric or liver cancer. Drug response results from different regimens in PDX studies were analyzed by the mRECIST method and compared with the corresponding results from clinical trials. Our results demonstrated that selecting PDX models with histopathology and genetic features matched to the corresponding patient population in clinical trials is important for predicting treatment results. Some widely used chemotherapy doses in preclinical studies need to be reduced to achieve consistency with clinical results. Longer treatment times and more models than those normally used in preclinical efficacy studies also improve predictive value, especially in cancer types with higher heterogeneity. The overall benefit of a targeted drug combined with one chemotherapy over its combination with another chemotherapy can be more accurately reflected in a large PDX population. 
In contrast, PDX models derived from naive patient samples showed little difference from models derived from chemo-resistant tumors in their responses to new targeted treatments. Drugs targeting RAS/RAF signaling, PI3K/AKT signaling or the cell cycle showed more uncertainty in PDX models if single biomarkers were used for drug response prediction. In summary, a sufficient number of PDX models with pathological and molecular features similar to the composition of human cancer patients in clinical trials is necessary for using PDX mouse trials to predict clinical outcomes. Mouse trials should be designed like clinical trials rather than like traditional preclinical studies for target validation or proof-of-concept efficacy tests. Citation Format: Jingjing Jiang, Ying Yan, Tingting Tan, Wei Du, Jiali Gu, Ling Qiu, Katherine Ye, Zhenyu Gu. Considerations in PDX mouse trial design and their relevance to human clinical trial outcomes [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2175.

  • Preprint Article
  • Cite count: 1
  • 10.69622/26053291
Statistical and computational methodologies for omics data analyses and drug response prediction
  • Sep 18, 2024
  • Quang Thinh Trac

With the availability of valuable omics data from recent high-throughput sequencing technologies, and a deeper understanding of the pathophysiology of multiple diseases, researchers can now focus on precision medicine to improve the effectiveness of current diagnosis and treatment methods. Unlike traditional treatment, which was largely subjective and based on clinicians' experience, modern treatment for complex diseases can be guided by the precision medicine approach, such as through the molecular classification of patients' diseases. However, despite the early promising outcomes of precision medicine, analyzing omics data to tailor effective treatments for patients and explore the biological mechanisms of diseases remains highly challenging. This is due not only to the heterogeneity of the disease but also to the complexity of the omics data.
In this thesis, we aim to develop statistical and computational methodologies for multi-omics data analyses and drug response prediction. The methodologies are applied to both simulated and real datasets from different diseases, with a particular focus on acute myeloid leukemia (AML) and amyotrophic lateral sclerosis (ALS). Through critical evaluation and validation analyses, we demonstrate that our methods perform well against competing methods.
In study I, we propose a pathway activation score (PAS) and apply it to identify and validate druggable cancer-specific pathways (DCSP) from pan-cancer datasets. Our hypothesis is that cancers with activated DCSPs are more likely to respond to the corresponding drug. In our analysis, we identified and validated 4,794 DCSPs across 23 cancers. Further focusing on AML, we show that tumor samples with higher PAS exhibit stronger drug responses, supporting our hypothesis.
In study II, we develop MDREAM, a prediction model for drug response in AML patients. We first train MDREAM on the BeatAML cohort using gene expression, mutation profiles, and drug response data. We further validate MDREAM in the test set of the BeatAML dataset and externally validate it in a Swedish AML dataset and a relapsed leukemia dataset. Our results demonstrate the robust and consistent performance of MDREAM across datasets. We also propose a confidence score metric to compute prediction uncertainty and illustrate its application within the MDREAM framework.
In study III, we implement DIPx, a machine learning model for personalized drug synergy prediction based on PAS. DIPx is trained and validated using the AstraZeneca-Sanger (AZS) DREAM Challenge dataset. Our validation results show that DIPx achieves higher accuracy than the top-performing method from the challenge. Additionally, we demonstrate how PAS can suggest potential biological mechanisms by identifying activated pathways that mediate drug synergy interactions.
In study IV, we introduce MegaFun, a computational method for quantifying the functional aspects of the microbiome from metagenomics data. MegaFun utilizes gene clusters based on sequence similarities at both the pangenome and isolate levels. To quantify functional abundance, it employs an alternating EM algorithm applied to a bilinear model capturing the complexity of the microbiome at the isolate level. In a simulated dataset, MegaFun outperforms HUMAnN, a state-of-the-art method for functional quantification. We also apply MegaFun to analyze a real metagenomics dataset from ALS patients.
In summary, we have developed novel statistical and computational methods to analyze omics data and drug responses. The results demonstrate that these methods perform well against existing methodologies. We hope that our work will advance omics data analysis and drug response prediction, and aid researchers in uncovering biological insights.
List of scientific papers
I. Quang Thinh Trac, Tingyou Zhou, Yudi Pawitan, and Trung Nghia Vu. Discovery of druggable cancer-specific pathways with application in acute myeloid leukemia. Gigascience. 11:giac091 (2022). https://doi.org/10.1093/gigascience/giac091
II. Quang Thinh Trac, Yudi Pawitan, Tian Mou, Tom Erkers, Päivi Östling, Anna Bohlin, Albin Österroos, Mattias Vesterlund, Rozbeh Jafari, Ioannis Siavelis, Helena Bäckvall, Santeri Kiviluoto, Lukas M. Orre, Mattias Rantalainen, Janne Lehtio, Sören Lehmann, Olli Kallioniemi, and Trung Nghia Vu. Prediction model for drug response of acute myeloid leukemia patients. npj Precis. Onc. 7, 32 (2023). https://doi.org/10.1038/s41698-023-00374-z
III. Quang Thinh Trac*, Yue Huang*, Tom Erkers, Päivi Östling, Anna Bohlin, Albin Österroos, Mattias Vesterlund, Rozbeh Jafari, Ioannis Siavelis, Helena Bäckvall, Santeri Kiviluoto, Lukas M. Orre, Mattias Rantalainen, Janne Lehtio, Sören Lehmann, Olli Kallioniemi, Yudi Pawitan and Trung Nghia Vu. Pathway activation model for personalized prediction of drug synergy. eLife 13:RP100071 (2024). (* Contributed equally) https://doi.org/10.7554/eLife.100071.1
IV. Quang Thinh Trac, Emily Joyce, Fredrik Boulund, Fang Fang, Yudi Pawitan and Trung Nghia Vu. Functional quantification of microbiome from metagenomics. [Manuscript]

  • Research Article
  • Cite Count: 70
  • 10.1038/s41598-020-58821-x
RefDNN: a reference drug based neural network for more accurate prediction of anticancer drug resistance
  • Feb 5, 2020
  • Scientific Reports
  • Jonghwan Choi + 2 more

Cancer is one of the most difficult diseases to treat owing to the drug resistance of tumour cells. Recent studies have revealed that drug responses are closely associated with genomic alterations in cancer cells. Numerous state-of-the-art machine learning models have been developed to predict drug responses from various genomic data and diverse drug molecular information, but those methods are ineffective at predicting responses for drugs and gene expression patterns not seen during training, which is known as the cold-start problem. In this study, we present a novel deep neural network model, termed RefDNN, for improved prediction of drug resistance and identification of biomarkers related to drug response. RefDNN exploits a collection of drugs, called reference drugs, to learn representations for a high-dimensional gene expression vector and a molecular structure vector of a drug, and predicts drug response labels using the reference drug-based representations. This design follows from the observation that similar chemicals have similar effects. The proposed model not only outperformed existing computational prediction models in most comparative experiments, but also showed more robust prediction for untrained drugs and cancer types than traditional machine learning models. RefDNN uses ElasticNet regularization to deal with high-dimensional gene expression data, which allows identification of gene markers associated with drug resistance. Lastly, we describe an application of RefDNN in exploring a new candidate drug for liver cancer. As the proposed model can provide good prediction of drug responses to untrained drugs for given gene expression patterns, it may be of potential benefit in drug repositioning and personalized medicine.
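
The ElasticNet regularization mentioned in the RefDNN abstract combines an L1 term, which pushes weights to exactly zero and so supports marker selection, with an L2 term. The sketch below shows only the penalty itself. The function name, the example weights, and the `alpha` / `l1_ratio` parameterization (borrowed from the common scikit-learn convention) are assumptions for illustration and are not taken from the RefDNN implementation.

```python
def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    """Return alpha * (l1_ratio * ||w||_1 + (1 - l1_ratio) / 2 * ||w||_2^2)."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_squared)


# Hypothetical weights linking three genes to one hidden unit.
gene_weights = [0.0, -2.0, 1.0]
print(elastic_net_penalty(gene_weights, alpha=1.0, l1_ratio=0.5))
```

In training, a penalty of this form would be added to the prediction loss; genes whose weights stay at zero are the ones the L1 term has effectively deselected.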

  • Research Article
  • Cite Count: 15
  • 10.1093/bioinformatics/btad734
MMCL-CDR: enhancing cancer drug response prediction with multi-omics and morphology images contrastive representation learning
  • Dec 1, 2023
  • Bioinformatics
  • Yang Li + 3 more

Motivation: Cancer is a complex disease that results in a significant number of global fatalities. Treatment strategies can vary among patients, even if they have the same type of cancer. The application of precision medicine in cancer shows promise for treating different types of cancer, reducing healthcare expenses, and improving recovery rates. To achieve personalized cancer treatment, machine learning models have been developed to predict drug responses based on tumor and drug characteristics. However, current studies focus on constructing either homogeneous networks from a single data source or heterogeneous networks from multiomics data. While multiomics data have shown potential in predicting drug responses in cancer cell lines, there is still a lack of research that effectively utilizes insights from different modalities. Furthermore, effectively utilizing the multimodal knowledge of cancer cell lines poses a challenge due to the heterogeneity inherent in these modalities. Results: To address these challenges, we introduce MMCL-CDR (Multimodal Contrastive Learning for Cancer Drug Responses), a multimodal approach for cancer drug response prediction that integrates copy number variation, gene expression, morphology images of cell lines, and chemical structure of drugs. The objective of MMCL-CDR is to align cancer cell lines across different data modalities by learning cell line representations from omic and image data, and to combine them with structural drug representations to enhance the prediction of cancer drug responses (CDR). We have carried out comprehensive experiments and show that our model significantly outperforms other state-of-the-art methods in CDR prediction. The experimental results also show that the model can learn more accurate cell line representations by integrating multiomics and morphological data from cell lines, thereby improving the accuracy of CDR prediction. In addition, the ablation study and qualitative analysis confirm the effectiveness of each part of our proposed model. Last but not least, MMCL-CDR opens up a new dimension for cancer drug response prediction through multimodal contrastive learning, pioneering a novel approach that integrates multiomics and multimodal drug and cell line modeling. Availability and implementation: MMCL-CDR is available at https://github.com/catly/MMCL-CDR.
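
The idea of aligning cell lines across modalities can be illustrated with a generic contrastive objective. The sketch below is a plain InfoNCE-style loss over paired omics and image embeddings. It is an assumption about what such an alignment loss might look like, not the loss used in MMCL-CDR; the function names, the temperature value, and the toy embeddings are all hypothetical, and every embedding is assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(omics, images, temperature=0.1):
    """Mean InfoNCE loss; omics[i] and images[i] describe the same cell line."""
    total = 0.0
    for i, anchor in enumerate(omics):
        logits = [cosine(anchor, img) / temperature for img in images]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / len(omics)


# Toy embeddings for two cell lines (hypothetical values).
omics_emb = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(omics_emb, [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce(omics_emb, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is small when each cell line's omics embedding is closest to its own image embedding and large when the pairing is wrong, which is the behavior an alignment objective needs.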

  • Research Article
  • 10.1158/1538-7445.am2013-2779
Abstract 2779: Establishment and molecular characterization of a panel of Asian patient-derived tumor xenograft models
  • Apr 15, 2013
  • Cancer Research
  • Xiaoran Qin + 13 more

Rodent tumor models with histological and molecular resemblance to human tumors and improved predictive value for clinical drug response are highly desired for oncology drug discovery and development. Patient-derived xenograft (PDX) tumor models are believed to better preserve features of human malignancy than cancer cell line-derived xenograft models. We have established more than 170 PDX models from Asian-prevalent human tumors in SCID or nude mice. Here we report molecular and pharmacological profiling of a panel of pancreatic adenocarcinoma (PAC) and hepatocellular carcinoma (HCC) models. Among 13 PAC models, Sequenom analysis showed KRAS mutation in all models, p53 mutation in 3, p16/CDKN2A deletion in 9, and SMAD4 deletion in 4 models. Comparison of transcriptomes by Affymetrix U133Plus2 between the original tumor and derived xenografts at different passages (P1-P6) in one PAC model revealed a high degree of similarity (R² = 0.92-0.97). Also, the gene mutational status and histological characteristics remained unchanged across the parental tumor and different passages of the PDX model. Evaluation of response to Gemzar treatment (60 mpk, Q4D x 3) showed significant tumor regression in 6 PAC models and partial tumor growth inhibition in 4 PAC models. Notably, the regression cohort and the partial response cohort displayed differential gene expression patterns. In addition, primary tumor cells derived from PAC models via tissue digestion were tested in vitro with Gemzar. The in vitro sensitivity to Gemzar of derived cells correlated with in vivo response of the parental PAC models. Gene expression analysis in 11 HCC models found aberrant gene expression involving several signaling pathways such as WNT, EGF/IGF and TGF-β, which recapitulates the alteration of these pathways in human HCC. 
These HCC PDX models exhibited major features of three HCC subclasses defined by the study in human HCC: S1 with activation of WNT and TGF-β pathways, S2 with enriched AKT and MYC activation, and S3 with β-catenin activation. The response of these models to Sorafenib was also examined, and varying degrees of sensitivity to Sorafenib were observed. Together, we have successfully generated PDX tumor models that preserve the histological and biological properties of the original human tumor and represent a heterogeneous patient population with varying degrees of response to the current standard of care. These models are a relevant and powerful tool for evaluation of anticancer therapeutics. Whole genome sequencing and gene expression profiling by RNAseq on PDX models are being performed. Citation Format: Xiaoran Qin, Zhonghua Tang, Gang Hu, Kedong Ouyang, Ke Wang, Fu Li, Fubo Xie, Qiuming Pan, Min Shi, Gang Zhao, Yixin Zhang, Chunchao Zhu, Danyi Wen, Weikang Tao. Establishment and molecular characterization of a panel of Asian patient-derived tumor xenograft models. [abstract]. In: Proceedings of the 104th Annual Meeting of the American Association for Cancer Research; 2013 Apr 6-10; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2013;73(8 Suppl):Abstract nr 2779. doi:10.1158/1538-7445.AM2013-2779
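
The abstract above reports transcriptome similarity between the original tumor and its xenograft passages as an R² value without defining it. A minimal sketch, assuming R² here means the squared Pearson correlation between two expression profiles measured over the same genes, is shown below. The function name and the expression values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov * cov / (var_x * var_y)


# Hypothetical log-expression values for five genes.
parental_tumor = [2.1, 5.4, 7.9, 3.3, 6.0]
passage_p3 = [2.4, 5.1, 8.2, 3.0, 6.3]
print(round(r_squared(parental_tumor, passage_p3), 3))
```

A value close to 1 means the xenograft's expression profile is close to a linear function of the parental tumor's profile across the measured genes.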

  • Research Article
  • Cite Count: 1
  • 10.1158/1538-7445.am2015-320
Abstract 320: Perfused 3D tri-culture breast cancer microtumors for accurate prediction of drug response
  • Aug 1, 2015
  • Cancer Research
  • Tessa M Desrochers + 9 more

Background: Breast cancer (BC) occurs in 1 of 8 women, often requiring debilitating surgery, chemotherapy or radiation for long-term survival. Histologic and molecular biomarkers are used to classify BC according to defined subtypes which dictate the choice of targeted therapy or of non-targeted cytotoxic therapy. Despite high initial response rates, relapses are common for more aggressive tumors, and choosing the right therapy for each patient remains challenging. In vitro 3D BC models maintain biologic features that more closely resemble clinical disease than 2D models. However, many 3D models do not contain multiple cell types, are maintained in static culture conditions and rely on immortalized cell lines previously propagated in 2D culture conditions. To address these issues, we developed long-term, 3D heterotypic BC microtumors, which recapitulate the dynamic interaction between stromal and epithelial components, retain subtype-specific biomarkers and demonstrate clinically relevant drug response. We further demonstrated the value of developing non-lytic, label-free in situ analysis to monitor morphology and function of complex 3D microtumors over time. Materials & Methods: ER+, Her2+ or triple negative (TNBC) cell lines (MCF7, SKBR3, MDA-MB-231) or patient-derived xenograft (PDX) cells were embedded with human mammary fibroblasts and adipose cells within a hydrogel encapsulated by a silk fibroin scaffold. Microtumors were maintained at least 4 weeks under perfusion flow utilizing the 3DKUBE™ and were characterized for cell morphology and phenotype (IHC), proliferation (PrestoBlue and PicoGreen), gene expression (qRT-PCR), redox ratio (multiphoton microscopy), and biomarker secretion (xMAP® multiplex immunoassay). Drug response profiling (DRP) was performed with tamoxifen, lapatinib and cisplatin. Results: 3D microtumors successfully recapitulated the morphology of primary BC predicted by molecular subtype and gene expression. 
Perfusion promoted cell proliferation and impacted redox ratio, gene expression, and biomarker secretion in comparison to static culture. Relative redox ratios of 3D microtumors were significantly different from those of cell lines in 2D (p<0.05). Perfusion, 3D conditions, Her2+ and TNBCs were independently associated with increased biomarker secretion, and both cell line and PDX microtumors had unique secretome signatures. PDX microtumors more accurately predicted drug response. Conclusions: Long-term, 3D heterotypic breast microtumors have unique metabolic and secretome signatures which are different than cells in 2D, and the microtumor morphology, metabolism and drug response can be monitored non-destructively in situ. Our ultimate goal is to develop these microtumors using primary human breast tumors for real time drug response profiling in the preclinical, co-clinical and clinical settings to improve outcomes for women with breast cancer. Citation Format: Tessa M. DesRochers, Stephen Shuford, Christina Mattingly, Terri Bruce, Zhiyi Liu, Kyle Quinn, Irene Georgakoudi, David L. Kaplan, David Orr, Howland E. Crosswell. Perfused 3D tri-culture breast cancer microtumors for accurate prediction of drug response. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 320. doi:10.1158/1538-7445.AM2015-320

  • Book Chapter
  • Cite Count: 3
  • 10.1007/978-981-13-6508-9_35
Integrating Heterogeneous Datasets by Using Multimodal Deep Learning
  • Jun 14, 2019
  • Fariba Khoshghalbvash + 1 more

Rapid collection of data sources that vary in volume and structure makes it challenging for scientists to establish a practical approach to handling heterogeneous data. Multimodal learning and integrated analysis make it possible to extract valuable information from a collection of simple raw data sources. Therefore, data integration can lead to more reliable and robust results. High-throughput sequencing technologies, especially next-generation sequencing, leave us with multi-platform genomic data such as gene expression, SNP, CNV, DNA methylation, and miRNA expression. In this paper, we present a multimodal deep neural network that exploits the mutual information between three different modalities to classify breast cancer patients into two groups based on their survival rate. Experimental results indicate that our method improves the classification accuracy and performs better on imbalanced data than other single-modal state-of-the-art methods.
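
One common way to build a multimodal classifier of this kind is late integration: each modality is encoded by its own subnetwork, the encodings are concatenated, and a shared head produces the two-class prediction. The sketch below shows only a forward pass under that assumption. The chapter abstract does not specify its architecture, so the layer shapes, weights, and function names here are hypothetical.

```python
import math


def dense_relu(x, weights, bias):
    """One fully connected layer with ReLU; weights holds one row per output unit."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def late_fusion_predict(modalities, encoders, head_weights, head_bias):
    """Encode each modality separately, concatenate, and apply a sigmoid head."""
    fused = []
    for features, (weights, bias) in zip(modalities, encoders):
        fused.extend(dense_relu(features, weights, bias))
    logit = sum(w * h for w, h in zip(head_weights, fused)) + head_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Two toy modalities, e.g. two expression features and one CNV feature.
inputs = [[1.0, 2.0], [3.0]]
encoders = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # encoder for modality 1
    ([[1.0]], [0.0]),                         # encoder for modality 2
]
probability = late_fusion_predict(inputs, encoders, [1.0, -1.0, 0.0], 0.0)
print(round(probability, 4))
```

Because each modality has its own encoder, a modality can be added or dropped by changing the list of encoders and the width of the head, without touching the other subnetworks.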
