Abstract
Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST.
Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0844-5) contains supplementary material, which is available to authorized users.
Highlights
Whole transcriptome expression profiling of single cells via RNA sequencing is the logical apex to single-cell gene expression experiments
Model-based analysis of single-cell transcriptomics (MAST) is suitable for supervised analyses of differential expression of genes and gene modules, as well as unsupervised analyses of model residuals, to generate hypotheses regarding co-expression of genes
MAST accounts for the bimodality of single-cell data by jointly modeling rates of expression and positive mean expression values
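One way to write down the two-part model described in these highlights is sketched below. The notation is an assumption made for illustration and does not come from this text: Y_ig is the expression of gene g in cell i, Z_ig indicates whether the gene is detected, and X_i is a covariate row that would contain the treatment and the cellular detection rate. The Gaussian form of the positive part is likewise an assumption of this sketch.

```latex
\begin{align}
Z_{ig} &= \mathbf{1}\{Y_{ig} > 0\}, \\
\operatorname{logit}\Pr(Z_{ig} = 1) &= X_i \beta_g^{D}, \\
Y_{ig} \mid Z_{ig} = 1 &\sim \mathcal{N}\!\left(X_i \beta_g^{C},\, \sigma_g^{2}\right).
\end{align}
```

Here the first regression captures the rate of expression and the second captures the positive mean expression value, so a treatment effect can appear in either coefficient vector or in both.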
Summary
Whole transcriptome expression profiling of single cells via RNA sequencing (scRNA-seq) is the logical apex to single-cell gene expression experiments. In contrast to transcriptomic experiments on mRNA derived from bulk samples, this technology provides powerful multiparametric measurements of gene co-expression at the single-cell level. Single-cell expression has repeatedly been shown to exhibit a characteristic bimodal expression pattern. Second, measuring single-cell gene expression might seem to obviate the need to normalize for starting RNA quantities, but recent work shows that cells scale transcript copy number with cell volume (a factor that affects gene expression globally) to maintain a constant mRNA concentration and constant biochemical reaction rates [10, 11]. Technical assay variability (e.g., mRNA quality, pre-amplification efficiency) and extrinsic biological factors (e.g., nuisance biological variability due to cell size) that globally affect transcription remain, and can significantly influence expression level measurements. Our approach allows for estimation and control of the “cellular detection rate” (CDR) while simultaneously estimating treatment effects.
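As a rough illustration of the quantities named above, the following Python sketch computes a per-cell CDR (the fraction of genes detected in a cell) and a descriptive two-part summary of one gene (its detection rate and the mean of its detected values). It is not the MAST implementation and performs no regression; the function names, the zero detection threshold, and the toy matrix are all hypothetical choices made for this example.

```python
from statistics import mean


def cellular_detection_rate(cell, threshold=0.0):
    """Fraction of genes in one cell with expression above `threshold`.

    The zero threshold is an assumption of this sketch.
    """
    return sum(1 for v in cell if v > threshold) / len(cell)


def two_part_summary(values, threshold=0.0):
    """Summarize one gene across cells as (detection rate, mean of detected values)."""
    detected = [v for v in values if v > threshold]
    rate = len(detected) / len(values)
    # The positive mean is undefined when the gene is never detected.
    positive_mean = mean(detected) if detected else float("nan")
    return rate, positive_mean


# Toy matrix (hypothetical values): rows are cells, columns are genes.
expr = [
    [0.0, 2.0, 3.0, 0.0],
    [1.0, 1.0, 0.0, 4.0],
    [0.0, 0.0, 5.0, 0.0],
]

# One CDR value per cell.
cdr = [cellular_detection_rate(cell) for cell in expr]

# Two-part summary of the third gene (column index 2).
rate, positive_mean = two_part_summary([cell[2] for cell in expr])
```

In a model-based analysis, the per-cell CDR would enter as a covariate alongside the treatment, rather than being inspected on its own as it is here.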