Abstract

Advances in high-throughput, single cell gene expression are allowing interrogation of cell heterogeneity. However, there is concern that the cell cycle phase of a cell might bias characterizations of gene expression at the single-cell level. We assess the effect of cell cycle phase on gene expression in single cells by measuring 333 genes in 930 cells across three phases and three cell lines. We determine each cell's phase non-invasively without chemical arrest and use it as a covariate in tests of differential expression. We observe bi-modal gene expression, a previously-described phenomenon, wherein the expression of otherwise abundant genes is either strongly positive, or undetectable within individual cells. This bi-modality is likely both biologically and technically driven. Irrespective of its source, we show that it should be modeled to draw accurate inferences from single cell expression experiments. To this end, we propose a semi-continuous modeling framework based on the generalized linear model, and use it to characterize genes with consistent cell cycle effects across three cell lines. Our new computational framework improves the detection of previously characterized cell-cycle genes compared to approaches that do not account for the bi-modality of single-cell data. We use our semi-continuous modelling framework to estimate single cell gene co-expression networks. These networks suggest that in addition to having phase-dependent shifts in expression (when averaged over many cells), some, but not all, canonical cell cycle genes tend to be co-expressed in groups in single cells. We estimate the amount of single cell expression variability attributable to the cell cycle. We find that the cell cycle explains only 5%–17% of expression variability, suggesting that the cell cycle will not tend to be a large nuisance factor in analysis of the single cell transcriptome.

Highlights

  • With the advent of single cell expression profiling [1,2,3,4], the assessment of cell population heterogeneity and identification of cell subpopulations from mRNA expression is achievable [5,6,7]

  • We find that cell cycle phase only accounts for a modest amount of the overall variability of gene expression within an individual cell

  • In order to assess differential expression associated with actively cycling cells, expression of 333 genes was interrogated in 930 cells, across three cell lines: H9 (HTB-176), MDA-MB-231 (HTB-26), and PC3 (CRL-1435) (Figure 1A)

Read more

Summary

Introduction

With the advent of single cell expression profiling [1,2,3,4], the assessment of cell population heterogeneity and identification of cell subpopulations from mRNA expression is achievable [5,6,7]. We have measured mRNA expression and cell cycle from 930 single cells derived from three cell lines in order to explore this hypothesis. A distinctive feature of single-cell gene expression data is the bimodality of expression values. Genes can be on (and a positive expression measure is recorded) or off (and the recorded expression is zero or negligible) [9,10]. This dichotomous characteristic of the data prevents use of the typical tools of designed experiments such as linear modeling and analysis of variance (ANOVA). After signal separation by thresholding, we model separately the frequency of expression (the fraction of cells expressing a gene) and the continuous, positive expression values. Extending our previous proposal of a two-sample semicontinuous test akin to the two-sample t-test, our new framework allows for testing arbitrary contrasts and allows the use of variance components/mixed models, bringing to bear the full power of the general linear model

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.