Abstract

Identifying how cell types and their abundances evolve during tumor progression is critical to understanding the mechanisms and identifying predictors of metastasis. Single-cell RNA sequencing (scRNA-seq) has been especially promising in resolving heterogeneity of expression programs at the single-cell level but is not always available, for example for large cohort studies or longitudinal analysis of archived samples. In such cases, cell subpopulations must be inferred by deconvolution, a process that can infer single-cell genomic data from bulk data but has limited ability to resolve fine clonal structure. We extend our previous bulk genomic deconvolution tool, Robust and Accurate Deconvolution (RAD), to establish a new method, scRAD, that can use reference scRNA-seq to interpret sample collections for which only bulk RNA-seq is available for some samples, e.g., clonally resolving archived primary (PRM) tissues using scRNA-seq from metastases (METs). We preprocess scRNA-seq data to accurately represent gene expression profiles, yielding a signature matrix S, and then extend our RAD method via a regularization term to deconvolve bulk data while maximizing consistency with S. We validate our method on semi-synthetic data derived from human PRM breast cancer cases and bone and ovary METs, showing that scRAD improves inference of single-cell gene expression profiles and their frequencies relative to the prior RAD with random initialization or initialization using the single-cell matrix S (Table 1). We then apply scRAD to a collection of paired PRM and MET tumors to quantify progression changes in common cell types. One-sided Kaplan-Meier analysis shows that tumors inferred to increase the mast cell fraction from PRM to MET exhibit lower overall survival (p<0.05), consistent with the role of mast cells in metastatic growth and propagation.
Tumors that show increased macrophage cell fraction from PRM to MET show improved overall survival (p<0.04), consistent with the role of immune infiltration in survival.

Table 1. Mean square error (MSE) of gene expression and mixture fraction inference on semi-simulated data

| Method | Sample number | Gene expression MSE | Mixture fraction MSE |
|---|---|---|---|
| RAD with random initialization | 2 | 0.41 | 0.58 |
| RAD with random initialization | 4 | 0.28 | 0.82 |
| RAD with random initialization | 8 | 0.37 | 0.59 |
| RAD initialized with S | 2 | 0.31 | 0.39 |
| RAD initialized with S | 4 | 0.27 | 0.83 |
| RAD initialized with S | 8 | 0.37 | 0.58 |
| scRAD | 2 | 0.22 | 0.19 |
| scRAD | 4 | 0.15 | 0.20 |
| scRAD | 8 | 0.13 | 0.22 |

Citation Format: Haoyun Lei, Xiaoyan A. Guo, Yifeng Tao, Kai Ding, Xuecong Fu, Steffi Oesterreich, Adrian V. Lee, Russell Schwartz. Improved deconvolution of combined bulk and single-cell RNA-sequencing data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 5031.
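
The abstract describes deconvolving bulk data with a regularization term that favors consistency with the signature matrix S, but it does not give the objective. The sketch below is one possible reading, not the authors' scRAD implementation. It assumes a bulk matrix B (genes x samples) is factored as C F, with C (genes x cell types) the inferred expression profiles and F (cell types x samples) the mixture fractions, and it assumes a squared Frobenius penalty lam * ||C - S||^2 solved by NMF-style multiplicative updates. The names B, C, F, lam, and deconvolve_with_reference are hypothetical and chosen for illustration only.

```python
import numpy as np


def objective(B, C, F, S, lam):
    """Reconstruction error plus the (assumed) penalty pulling C toward S."""
    return float(np.sum((B - C @ F) ** 2) + lam * np.sum((C - S) ** 2))


def deconvolve_with_reference(B, S, lam=1.0, n_iter=300, eps=1e-12):
    """Illustrative sketch: minimize ||B - C F||^2 + lam * ||C - S||^2, C, F >= 0.

    B: nonnegative bulk expression, shape (genes, samples).
    S: nonnegative reference signature matrix, shape (genes, cell_types).
    Returns the inferred profiles C, per-sample fractions (columns sum to 1),
    and the objective value before and after every iteration.
    """
    B = np.asarray(B, dtype=float)
    S = np.asarray(S, dtype=float)
    k = S.shape[1]
    C = S + eps                              # start from the reference profiles
    F = np.full((k, B.shape[1]), 1.0 / k)    # start from uniform mixtures
    history = [objective(B, C, F, S, lam)]
    for _ in range(n_iter):
        # Multiplicative update for F (standard Lee-Seung form).
        F *= (C.T @ B) / (C.T @ C @ F + eps)
        # Multiplicative update for C; the penalty adds lam*S to the
        # numerator and lam*C to the denominator.
        C *= (B @ F.T + lam * S) / (C @ (F @ F.T) + lam * C + eps)
        history.append(objective(B, C, F, S, lam))
    # Normalize only for reporting, so each sample's fractions sum to 1.
    fractions = F / (F.sum(axis=0, keepdims=True) + eps)
    return C, fractions, history


# Small synthetic example (made-up data, purely to exercise the function).
rng = np.random.default_rng(0)
genes, cell_types, samples = 50, 3, 4
C_true = rng.gamma(2.0, 1.0, size=(genes, cell_types))
F_true = rng.dirichlet(np.ones(cell_types), size=samples).T
B_demo = np.clip(C_true @ F_true + 0.01 * rng.normal(size=(genes, samples)), 0, None)
S_demo = np.clip(C_true + 0.1 * rng.normal(size=C_true.shape), 0, None)
C_hat, frac_hat, hist = deconvolve_with_reference(B_demo, S_demo, lam=1.0)
```

In this sketch, a larger lam keeps the inferred profiles closer to the single-cell reference, while lam = 0 reduces to an unregularized factorization that uses S only as the starting point. The MSE values in Table 1 come from the authors' own method and data and are not reproduced by this example.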

