Abstract

Many genes are expressed in bursts, which can contribute to cell-to-cell heterogeneity. It is now possible to measure this heterogeneity with high throughput single cell gene expression assays (single cell qPCR and RNA-seq). These experimental approaches generate gene expression distributions that can be used to estimate the kinetic parameters of gene expression bursting, namely the rate at which genes turn on, the rate at which genes turn off, and the rate of transcription. We construct a complete pipeline for the analysis of single cell qPCR data that uses the mathematics behind bursty expression to develop more accurate and robust algorithms for analyzing the origin of heterogeneity in experimental samples: an algorithm for clustering cells by their bursting behavior (Simulated Annealing for Bursty Expression Clustering, SABEC) and a statistical tool for comparing the kinetic parameters of bursty expression across populations of cells (Estimation of Parameter changes in Kinetics, EPiK). We applied these methods to hematopoiesis, including a new single cell dataset in which transcription factors (TFs) involved in the earliest branchpoint of blood differentiation were individually up- and down-regulated. We identified two unique sub-populations within a seemingly homogeneous group of hematopoietic stem cells. In addition, we predicted regulatory mechanisms controlling the expression levels of eighteen key hematopoietic transcription factors throughout differentiation. Detailed information about gene regulatory mechanisms can therefore be obtained from high throughput single cell gene expression data alone, an approach that should be widely applicable given the rapid expansion of single cell genomics.

Highlights

  • Many genes are expressed in stochastic bursts: there are time periods where many transcripts are quickly produced, interspersed randomly with gaps of little or no transcriptional activity

  • We construct a pipeline for analyzing single cell gene expression data that uses the mathematics behind bursty expression

  • We provide a complete pipeline in R for analyzing single cell qPCR data, including data normalization steps, Simulated Annealing for Bursty Expression Clustering (SABEC) clustering and Estimation of Parameter changes in Kinetics (EPiK) cluster comparisons



Introduction

Many genes are expressed in stochastic bursts: periods in which many transcripts are produced quickly are interspersed randomly with gaps of little or no transcriptional activity. In this model, each gene is in either an on or an off state and transitions stochastically between the two, with transcription taking place only when the gene is on. The distribution of mRNA transcripts across a population of cells is determined by three kinetic parameters: the rate at which the gene turns on (Kon), the rate at which the gene turns off (Koff), and the rate of transcription when the gene is on (Kt), all normalized to the rate of mRNA degradation [7] (Fig 1B).
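To make the role of the three kinetic parameters concrete, the sketch below draws mRNA counts for a population of cells from the two-state model. It relies on a standard steady-state result for this model, used here as an assumption rather than as a description of the pipeline itself: with all rates normalized to mRNA degradation, the transcript count in a cell is Poisson-distributed with mean Kt × p, where p follows a Beta(Kon, Koff) distribution. The function names and parameter values are illustrative only and are not taken from the SABEC or EPiK code.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the small means used in this sketch; exp(-lam)
    underflows for very large lam, so it is not a general-purpose sampler.
    """
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_mrna_counts(k_on, k_off, k_t, n_cells, seed=0):
    """Sample steady-state mRNA counts for n_cells under the two-state model.

    Assumes k_on, k_off and k_t are already normalized to the mRNA
    degradation rate, so that counts follow a Beta-Poisson mixture.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_cells):
        # Beta-distributed latent variable reflecting how active the gene is.
        activity = rng.betavariate(k_on, k_off)
        counts.append(sample_poisson(k_t * activity, rng))
    return counts


def theoretical_mean(k_on, k_off, k_t):
    """Mean transcript count implied by the Beta-Poisson form."""
    return k_t * k_on / (k_on + k_off)


if __name__ == "__main__":
    # Hypothetical parameter values chosen to give visibly bursty expression.
    counts = sample_mrna_counts(k_on=0.5, k_off=2.0, k_t=20.0, n_cells=20000)
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    print("sample mean:", mean)
    print("theoretical mean:", theoretical_mean(0.5, 2.0, 20.0))
    print("Fano factor (variance / mean):", variance / mean)
```

With Kon and Koff both small relative to degradation, most sampled cells carry few or no transcripts while a minority carry many, so the variance exceeds the mean. A gene that is always on would instead give a plain Poisson distribution with a Fano factor of 1.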
