Abstract
Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single cell RNA sequencing (scRNA-seq) studies. Here, we develop an integrative and scalable computational method, iDEA, to perform joint DE and GSE analysis through a hierarchical Bayesian framework. By integrating DE and GSE analyses, iDEA can improve the power and consistency of DE analysis and the accuracy of GSE analysis. Importantly, iDEA uses only DE summary statistics as input, enabling effective data modeling through complementing and pairing with various existing DE methods. We illustrate the benefits of iDEA with extensive simulations. We also apply iDEA to analyze three scRNA-seq data sets, where iDEA achieves up to five-fold power gain over existing GSE methods and up to 64% power gain over existing DE methods. The power gain brought by iDEA allows us to identify many pathways that would not be identified by existing approaches in these data.
Highlights
Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single cell RNA sequencing studies
An overview of iDEA is described in Methods, with technical details provided in Supplementary Notes 1 and 2
iDEA examines one gene set at a time, performs inference through an expectation-maximization algorithm, and uses the Louis method [23] to compute a calibrated p-value testing whether the gene set is enriched in DE genes
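To make the "one gene set at a time, fitted by expectation maximization" idea concrete, here is a minimal sketch of an EM fit for a simplified two-group mixture on per-gene DE z-scores. It is not the iDEA model or the authors' implementation (those are in the paper's Methods and Supplementary Notes). The mixture form, the function name `em_enrichment`, the parameters `tau0`, `tau1`, and `s2`, and the simulated data are all assumptions made for illustration. The Louis-method step that turns the fit into a p-value is omitted.

```python
import math
import random


def normal_pdf(z, var):
    """Density of a zero-mean normal with variance `var`, evaluated at z."""
    return math.exp(-0.5 * z * z / var) / math.sqrt(2.0 * math.pi * var)


def logit(p):
    return math.log(p / (1.0 - p))


def em_enrichment(z, in_set, n_iter=500, tol=1e-8):
    """EM for an illustrative (assumed) model, not the published iDEA model.

    Non-DE genes: z ~ N(0, 1).  DE genes: z ~ N(0, 1 + s2).
    P(gene is DE) = sigmoid(tau0 + tau1 * in_set), so tau1 > 0 means the
    gene set is enriched in DE genes. `in_set` must contain both True and
    False entries.
    """
    p_out, p_in, s2 = 0.1, 0.1, 4.0  # arbitrary starting values
    eps = 1e-6
    gamma = []
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is DE.
        gamma = []
        for zj, aj in zip(z, in_set):
            prior = p_in if aj else p_out
            f1 = prior * normal_pdf(zj, 1.0 + s2)
            f0 = (1.0 - prior) * normal_pdf(zj, 1.0)
            total = f1 + f0
            # If both densities underflow, z is extreme: treat the gene as DE.
            gamma.append(f1 / total if total > 0.0 else 1.0)

        # M-step: with a binary annotation, the logistic prior reduces to
        # one DE proportion inside the set and one outside it.
        g_in = [g for g, a in zip(gamma, in_set) if a]
        g_out = [g for g, a in zip(gamma, in_set) if not a]
        new_p_in = min(max(sum(g_in) / len(g_in), eps), 1.0 - eps)
        new_p_out = min(max(sum(g_out) / len(g_out), eps), 1.0 - eps)
        weighted_var = sum(g * zj * zj for g, zj in zip(gamma, z)) / sum(gamma)
        new_s2 = max(weighted_var - 1.0, eps)

        change = max(abs(new_p_in - p_in), abs(new_p_out - p_out), abs(new_s2 - s2))
        p_in, p_out, s2 = new_p_in, new_p_out, new_s2
        if change < tol:
            break

    tau0 = logit(p_out)
    tau1 = logit(p_in) - logit(p_out)
    return tau0, tau1, s2, gamma


# Simulated example (made-up data): 2000 genes, the first 200 belong to the
# gene set, and set members are DE more often than the remaining genes.
random.seed(0)
z, in_set = [], []
for j in range(2000):
    member = j < 200
    is_de = random.random() < (0.6 if member else 0.1)
    z.append(random.gauss(0.0, 3.0 if is_de else 1.0))
    in_set.append(member)

tau0, tau1, s2, gamma = em_enrichment(z, in_set)
print("tau0 = %.3f, tau1 = %.3f, s2 = %.3f" % (tau0, tau1, s2))
```

In this sketch the sign and size of `tau1` play the role of the enrichment parameter, and `gamma` holds per-gene posterior DE probabilities. A calibrated test would additionally need a standard error for `tau1`, which the highlight above attributes to the Louis method.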
Summary
Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single cell RNA sequencing (scRNA-seq) studies. Effective analysis of noisy scRNA-seq data requires the development of powerful statistical tools. We develop such a tool for two of the most commonly applied analyses in scRNA-seq studies: DE analysis and GSE analysis. It is plausible that, because of low statistical power, different scRNA-seq DE methods tend to prioritize different sets of DE genes in real data applications, leading to sub-optimal performance and inconsistent results across methods. No comparison studies have been performed so far to evaluate the effectiveness of existing GSE methods in the scRNA-seq setting. We develop a statistical method, which we refer to as the integrative Differential expression and gene set Enrichment