Background

The advent of high throughput RNA-seq at the single-cell level has opened up new opportunities to elucidate the heterogeneity of gene expression. One of the most widespread applications of RNA-seq is to identify genes which are differentially expressed between two experimental conditions.

Results

We present a discrete, distributional method for differential gene expression (D3E), a novel algorithm specifically designed for single-cell RNA-seq data. We use synthetic data to evaluate D3E, demonstrating that it can detect changes in expression, even when the mean level remains unchanged. Since D3E is based on an analytically tractable stochastic model, it provides additional biological insights by quantifying biologically meaningful properties, such as the average burst size and frequency. We use D3E to investigate experimental data, and with the help of the underlying model, we directly test hypotheses about the driving mechanism behind changes in gene expression.

Conclusion

Evaluation using synthetic data shows that D3E performs better than other methods for identifying differentially expressed genes since it is designed to take full advantage of the information available from single-cell RNA-seq experiments. Moreover, the analytical model underlying D3E makes it possible to gain additional biological insights.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-0944-6) contains supplementary material, which is available to authorized users.
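To illustrate why a distributional comparison can detect a change that a comparison of means misses, here is a minimal sketch. It is not the D3E implementation: the read counts are invented for illustration, the function name `ks_statistic` is hypothetical, and the two-sample Kolmogorov-Smirnov statistic is used only as one familiar example of a distribution-level measure, without any claim that it is the test D3E applies.

```python
from bisect import bisect_right
from statistics import mean


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of a and b.
    """
    sa, sb = sorted(a), sorted(b)
    points = sorted(set(sa) | set(sb))
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in points
    )


# Invented read counts for one gene across eight cells per condition.
steady = [9, 10, 11, 10, 9, 11, 10, 10]  # every cell expresses near the mean
bursty = [0, 0, 0, 0, 20, 20, 21, 19]    # half the cells silent, half high

# The means are identical, so a mean-based comparison sees no difference.
print(mean(steady), mean(bursty))

# The distributions differ, and the KS statistic reflects that.
print(ks_statistic(steady, bursty))
```

Both conditions have the same mean count, yet the empirical distributions are clearly separated. In practice a statistic like this would be paired with a p-value and a multiple-testing correction across genes.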