Abstract

3627 Background: Differential gene expression (DGE) methods, initially developed for analyzing bulk RNA changes in pure tumor cell lines under experimental settings, are commonly used to identify biomarkers in patient tumor samples and to infer biological differences between them. These samples are admixtures of tumor and non-tumor components. Methods to sensitively and accurately detect cell type-specific expression differences in admixed patient samples are not well characterized but may greatly affect emerging targeted and immunotherapy biomarker strategies. To address this issue, we developed a simulation framework to benchmark our ability to detect changes in tumor-intrinsic gene expression.
Methods: Pseudobulk RNAseq melanoma cohorts were simulated by sampling from melanoma single cell RNAseq data. Simulation parameters were optimized to maximize concordance of gene expression means and variances (Spearman r = 0.81, 0.68, respectively) between the TCGA SKCM cohort (n = 462) and a matched simulated cohort, and then validated in two independent melanoma cohorts (n = 42, 129; means Spearman r = 0.80, 0.78; variances Spearman r = 0.68, 0.63). Using this simulation framework, we benchmarked the effect of sample size, magnitude of differential expression, and differences in cell type proportions on the sensitivity and positive predictive value (PPV) of detecting true differentially expressed genes in the tumor-intrinsic compartment.
Results: Reference cohorts of 50 total tumors (n = 10) were simulated to contain a 2 standard deviation tumor-intrinsic expression change in 50 randomly selected genes and an 11% difference in mean purity between two equally sized 25-tumor subgroups. DGE analysis using DESeq2 with an FDR q-value threshold of 0.1 yielded a sensitivity of 0.37 and PPV of 0.29. DGE analysis of the same simulated cohorts using a non-parametric Mann-Whitney U test with an FDR q-value threshold of 0.1 yielded a sensitivity of 0.13 and PPV of 0.76.
Conclusions: Commonly used DGE methods for existing expression-based biomarker strategies have poor sensitivity and PPV in admixed tumor samples, limiting our ability to find meaningful transcriptional biomarkers in clinical cohorts. We are currently developing methods to more accurately detect true differentially expressed genes in admixed bulk RNAseq samples and applying these approaches for biomarker discovery in immunotherapy-treated patient cohorts and other clinical tumor cohorts.
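
The benchmark metrics reported above (sensitivity and PPV at an FDR q-value threshold of 0.1) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that per-gene p-values are already available from some DGE test, and that q-values come from the Benjamini-Hochberg procedure, which the abstract does not specify. The function names and the toy p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order.

    Assumption: the abstract does not name its FDR method; BH is used
    here only as a common default.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def sensitivity_and_ppv(called, truth):
    """Sensitivity = TP / true DE genes; PPV = TP / called genes."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    ppv = true_positives / len(called) if called else float("nan")
    return sensitivity, ppv


def evaluate_calls(gene_pvalues, true_de_genes, q_threshold=0.1):
    """Call genes with q < q_threshold and score them against the truth."""
    genes = list(gene_pvalues)
    qvalues = benjamini_hochberg([gene_pvalues[g] for g in genes])
    called = [g for g, q in zip(genes, qvalues) if q < q_threshold]
    return sensitivity_and_ppv(called, true_de_genes)


# Toy example with made-up p-values; genes "A" and "D" are the
# simulated true positives.
toy_pvalues = {"A": 0.001, "B": 0.01, "C": 0.03, "D": 0.5, "E": 0.9}
toy_sensitivity, toy_ppv = evaluate_calls(toy_pvalues, {"A", "D"})
print(f"sensitivity={toy_sensitivity:.2f}, PPV={toy_ppv:.2f}")
```

In a simulation setting the set of truly perturbed genes is known by construction, which is what makes both metrics computable. In a real patient cohort only the called set is observed.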
