Abstract

Background: The advent of high throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression. One of the most widespread applications of RNA-seq is to identify genes which are differentially expressed between two experimental conditions.

Results: We present a discrete, distributional method for differential gene expression (D3E), a novel algorithm specifically designed for single-cell RNA-seq data. We use synthetic data to evaluate D3E, demonstrating that it can detect changes in expression, even when the mean level remains unchanged. Since D3E is based on an analytically tractable stochastic model, it provides additional biological insights by quantifying biologically meaningful properties, such as the average burst size and frequency. We use D3E to investigate experimental data, and with the help of the underlying model, we directly test hypotheses about the driving mechanism behind changes in gene expression.

Conclusion: Evaluation using synthetic data shows that D3E performs better than other methods for identifying differentially expressed genes since it is designed to take full advantage of the information available from single-cell RNA-seq experiments. Moreover, the analytical model underlying D3E makes it possible to gain additional biological insights.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0944-6) contains supplementary material, which is available to authorized users.

Highlights

  • The advent of high throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression

  • One of the most common uses of gene expression data is to identify differentially-expressed (DE) genes between two groups of replicates collected from distinct experimental conditions, e.g. stimulated vs unstimulated, mutant vs wild-type, or at separate time-points

  • We present D3E, a method based on the comparison of two probability distributions for performing differential gene expression analysis

Summary

Introduction

The advent of high throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression. With the exception of SCDE [16], the most common tools for performing single-cell differential expression (DE) analysis (DESeq2 [18], Cuffdiff [35], limma [29] and edgeR [30]) are all adaptations of bulk RNA-sequencing methods. They mainly focus on filtration and normalisation of the raw data, and DE genes are identified based on changes in mean expression levels. What these methods have in common is that they summarize the difference between two distributions as a single value, which can be used to test for significance.
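To make the limitation of mean-based comparison concrete, the following minimal sketch builds two hypothetical sets of read counts for a single gene that share the same mean but have very different shapes, then compares them with a two-sample Kolmogorov-Smirnov statistic. The counts, the variable names and the choice of the Kolmogorov-Smirnov statistic are assumptions made for illustration only; the text above does not specify which test D3E applies, and this is not D3E's implementation.

```python
from bisect import bisect_right
from statistics import mean


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for v in set(xs) | set(ys):
        fx = bisect_right(xs, v) / len(xs)  # fraction of x values <= v
        fy = bisect_right(ys, v) / len(ys)  # fraction of y values <= v
        gap = max(gap, abs(fx - fy))
    return gap


# Hypothetical read counts for one gene, 100 cells per condition (illustrative data).
unimodal = [8, 9, 10, 11, 12] * 20  # every cell expresses the gene at a level near 10
bimodal = [0] * 50 + [20] * 50      # half of the cells are silent, half express at 20

mean_shift = mean(bimodal) - mean(unimodal)
ks = ks_statistic(unimodal, bimodal)
print(f"mean shift = {mean_shift}, KS statistic = {ks}")
```

In this toy case the mean shift is zero, so a test built on mean expression has nothing to detect, while the distance between the empirical distributions is large. A real analysis would also need a significance threshold or p-value for the statistic, which this sketch leaves out.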

