Abstract

Gene Set Enrichment Analysis (GSEA) is a method for quantifying the activation of pathways and processes in gene expression data. GSEA works by ranking genes and testing whether the genes in annotated gene sets, which represent molecular pathways, are overrepresented at the top or bottom of the ranked list. Standard GSEA assumes a gene-by-sample matrix in which each sample belongs to one of two phenotype classes, and it ranks genes by the correlation between their expression and the sample class labels. Single-sample GSEA (ssGSEA) uses a similar approach to enrichment scoring but ranks the genes within a single sample by their expression. There has been recent interest in using ssGSEA to measure pathway activity in individual cells from single-cell RNA sequencing (scRNA-seq) data. However, we have observed that ssGSEA scores for individual scRNA-seq cells are prone to uncertainty because of the characteristic sparsity of scRNA-seq data. Sparse data produce large groups of tied genes, that is, genes with equal expression and therefore equivalent positions in the ranked list. ssGSEA was designed for bulk gene expression technologies, which yield much less sparse data, and it breaks ties arbitrarily. As a result, these ties can introduce variability in enrichment scores that is purely algorithmic and does not reflect the underlying biology. To address this issue, we performed benchmarking to characterize the stability of scores under different enrichment scoring statistics, normalization methods, and aggregation of similar single cells. We have implemented the best practices identified in this analysis as Single Cell GSEA (scGSEA), a software tool that we will make freely available as open-source software to enable robust gene set enrichment scoring in scRNA-seq data.

Citation Format: Alexander T. Wenzel, John Jun, Jill P. Mesirov. Adapting gene set enrichment analysis to single cell data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 2343.
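
The tie-breaking effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' scGSEA tool and not the reference ssGSEA implementation: `ssgsea_like_score` is a hypothetical, simplified rank-weighted running-sum score, the exponent `alpha=0.25` and the toy expression vectors are assumptions chosen for illustration, and ties are broken at random so that the resulting variability can be observed across seeds.

```python
import numpy as np


def ssgsea_like_score(expr, in_set, rng, alpha=0.25):
    """Simplified ssGSEA-style score for one sample (illustrative only).

    expr   : 1-D array of expression values for one sample or cell
    in_set : boolean array marking genes that belong to the gene set
    rng    : numpy Generator used to break ties at random
    """
    n = expr.size
    # Shuffle first, then sort stably by decreasing expression:
    # genes with equal expression end up in random relative order.
    perm = rng.permutation(n)
    order = perm[np.argsort(-expr[perm], kind="stable")]
    hits = in_set[order]
    # Rank-based weights: the top-ranked gene gets n**alpha, the last gets 1.
    weights = np.arange(n, 0, -1, dtype=float) ** alpha
    hit_cdf = np.cumsum(np.where(hits, weights, 0.0))
    hit_cdf /= hit_cdf[-1]
    miss_cdf = np.cumsum(~hits) / (~hits).sum()
    # Sum of the differences between the two running distributions.
    return float(np.sum(hit_cdf - miss_cdf))


n_genes = 2000
in_set = np.zeros(n_genes, dtype=bool)
in_set[:10] = True        # 10 set genes that are expressed in the sparse cell
in_set[1000:1040] = True  # 40 set genes that are zero in the sparse cell

# Sparse "cell": only the first 200 genes have distinct nonzero values,
# so the remaining 1800 genes are tied at zero.
sparse = np.zeros(n_genes)
sparse[:200] = np.linspace(10.0, 1.0, 200)

# Dense "bulk-like" sample: every gene has a distinct value, so no ties.
dense = np.linspace(10.0, 1.0, n_genes)

sparse_scores = [ssgsea_like_score(sparse, in_set, np.random.default_rng(s))
                 for s in range(20)]
dense_scores = [ssgsea_like_score(dense, in_set, np.random.default_rng(s))
                for s in range(20)]

print("score spread, sparse cell:", max(sparse_scores) - min(sparse_scores))
print("score spread, dense sample:", max(dense_scores) - min(dense_scores))
```

In this toy setup the dense sample has no ties, so its ranking, and therefore its score, is identical for every seed. The sparse cell's score changes from seed to seed only because the zero-expression genes are ordered differently, which is the purely algorithmic variability the abstract refers to.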

