Abstract
Single-cell RNA sequencing (scRNA-seq) has become a prominent tool for studying models of prostate cancer. However, few approaches are geared towards collating these experiments into a unified framework for comparing discoveries across studies. In our lab, we have performed scRNA-seq on multiple organoid models spanning a spectrum of phenotypes, including amphicrine, adenocarcinoma, and neuroendocrine disease, and have performed lineage barcode tracing in a subset of these experiments to identify lineage transitions. Additionally, we have performed scRNA-seq on a robust patient-derived xenograft (PDX) model of metastasis, consisting of experiments from primary and metastatic PDX models collected from multiple tissues in both intact and castrated backgrounds. Along with data generated by our lab, we accessed published scRNA-seq studies from other labs, which broadened the pool of experiments to include data from cell lines and multiple single-cell analysis platforms. To perform computational molecular phenotyping of the scRNA-seq experiments, we curated gene sets from the literature and applied principal components analysis (PCA) to them as our molecular phenotyping score, an approach widely used in previous studies. Because of the diverse backgrounds of the studies incorporated into our molecular phenotyping, we used the Bioconductor stack of single-cell analysis tools to perform quality control and remove batch effects. We observe that molecular phenotyping of samples across different backgrounds shows common populations typically associated with proliferation and stemness, and that unique populations present in each sample are usually more associated with terminal differentiation and senescence. In addition to identifying common and unique populations across samples and platforms, we can also assess the importance of individual genes in the molecular phenotyping approach.
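As a rough illustration of the PCA-based scoring described above, the following minimal sketch computes the first principal component of a small gene set, uses each cell's projection onto it as a phenotype score, and ranks genes by the magnitude of their loading (their entry in the rotation matrix). The gene names and expression values are hypothetical, and the power-iteration routine is a dependency-free stand-in chosen for illustration; it is not the Bioconductor workflow used in the study.

```python
import math

def first_pc(matrix, n_iter=1000, tol=1e-12):
    """Return (loadings, scores) for the first principal component.

    matrix: one row per cell, one column per gene in the gene set.
    Uses power iteration on the gene-by-gene covariance matrix.
    """
    n_cells, n_genes = len(matrix), len(matrix[0])
    # Center every gene (column) at zero mean.
    means = [sum(row[j] for row in matrix) / n_cells for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    # Sample covariance between genes.
    cov = [[sum(r[a] * r[b] for r in centered) / (n_cells - 1)
            for b in range(n_genes)] for a in range(n_genes)]
    # Power iteration from a uniform start vector (assumes the leading
    # eigenvector is not orthogonal to it, which holds for this toy data).
    v = [1.0 / math.sqrt(n_genes)] * n_genes
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(n_genes)) for a in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
        delta = max(abs(x - y) for x, y in zip(w, v))
        v = w
        if delta < tol:
            break
    # Per-cell score: projection of the centered profile onto the loadings.
    scores = [sum(r[j] * v[j] for j in range(n_genes)) for r in centered]
    return v, scores

# Hypothetical gene set and toy expression values (6 cells x 3 genes).
genes = ["GENE_A", "GENE_B", "GENE_C"]
expression = [
    [1.0, 1.2, 0.5],
    [2.0, 2.1, 0.4],
    [3.0, 2.9, 0.6],
    [4.0, 4.2, 0.5],
    [5.0, 4.8, 0.4],
    [6.0, 6.1, 0.6],
]

loadings, scores = first_pc(expression)

# Rank genes by absolute loading: larger magnitude means a larger
# contribution to the phenotype score.
ranked = sorted(zip(genes, loadings), key=lambda pair: -abs(pair[1]))

for gene, loading in ranked:
    print(f"{gene}: loading = {loading:.3f}")
print("cell scores:", [round(s, 3) for s in scores])
```

In this toy example the two co-varying genes dominate the first component, while the nearly constant gene receives a loading close to zero, which is the kind of signal that could justify dropping it from a more concise gene set.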
Because our molecular phenotyping score is generated with PCA, the rotation matrix can serve as an additional metric for ranking genes by biological importance, in concert with commonly used metrics such as variability and deviance. When these global metrics are combined with the rotation matrix for each gene set, genes can be arranged in a network to determine the key genes for molecular phenotyping, allowing for more concise gene sets that decrease the computational overhead of the molecular phenotyping. Overall, our work suggests that a unified framework of analysis can yield new insights into scRNA-seq studies by identifying common populations present across sample backgrounds and different scRNA-seq technologies. Additionally, genes can be arranged into networks based on biological significance derived from both global statistical analysis and molecular phenotyping to identify more concise gene signatures, decreasing computational overhead and highlighting genes that contribute disproportionately to biological state.
Citation Format: Brian Capaldo, Michael Beshiri, Juan Yin, Kathy Kelly. Molecular phenotyping in single cell RNA sequencing allows for identification of common populations across studies and platforms [abstract]. In: Proceedings of the AACR Special Conference: Advances in Prostate Cancer Research; 2023 Mar 15-18; Denver, Colorado. Philadelphia (PA): AACR; Cancer Res 2023;83(11 Suppl):Abstract nr A066.