Abstract

Background: Sample size calculation is an important issue in the experimental design of biomedical research. For RNA-seq experiments, the sample size calculation method based on the Poisson model has been proposed; however, when there are biological replicates, RNA-seq data could exhibit variation significantly greater than the mean (i.e. over-dispersion). The Poisson model cannot appropriately model the over-dispersion, and in such cases, the negative binomial model has been used as a natural extension of the Poisson model. Because the field currently lacks a sample size calculation method based on the negative binomial model for assessing differential expression analysis of RNA-seq data, we propose a method to calculate the sample size.

Results: We propose a sample size calculation method based on the exact test for assessing differential expression analysis of RNA-seq data.

Conclusions: The proposed sample size calculation method is straightforward and not computationally intensive. Simulation studies to evaluate the performance of the proposed sample size method are presented; the results indicate our method works well, with achievement of desired power.

Highlights

  • Sample size calculation is an important issue in the experimental design of biomedical research

  • One of the principal questions in designing an RNA-seq experiment is: What is the optimal number of biological replicates to achieve the desired statistical power? (Note: In this article, the term “sample size” refers to the number of biological replicates or number of subjects.) Because RNA-seq data are counts, the Poisson distribution has been widely used to model the number of reads obtained for each gene to identify differential gene expression [8,13].

  • Based on the negative binomial model, [14,15] proposed a quantile-adjusted conditional maximum likelihood procedure to create a pseudocount, which led to the development of an exact test for assessing differential expression in RNA-seq data


Summary

Introduction

Sample size calculation is an important issue in the experimental design of biomedical research. For RNA-seq experiments, the sample size calculation method based on the Poisson model has been proposed; however, when there are biological replicates, RNA-seq data could exhibit variation significantly greater than the mean (i.e. over-dispersion). Unlike the microarray chip, which offers only quantification of gene expression level, RNA-seq provides expression level data as well as differentially spliced variants, gene fusion, and mutation profile data. Such advantages have gradually elevated RNA-seq as the technology of choice among researchers. Based on the negative binomial model, [14,15] proposed a quantile-adjusted conditional maximum likelihood procedure to create a pseudocount, which led to the development of an exact test for assessing differential expression in RNA-seq data. [16] provided a Bioconductor package, edgeR, based on the exact test.
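
To make the over-dispersion point concrete, the following is a minimal sketch (not the authors' method or code) that simulates read counts for a single gene under a Poisson model and under a negative binomial model. It assumes the common parameterization in which the negative binomial variance is mean + dispersion × mean², so that a dispersion of 0 reduces to the Poisson case. The function names, the mean of 20, the dispersion of 0.1, and the number of simulated replicates are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def poisson_draw(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method.

    Adequate for the modest rates used in this illustration.
    """
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def negative_binomial_draw(mean, dispersion, rng):
    """Draw one negative binomial count as a gamma-Poisson mixture.

    Assumed parameterization: Var = mean + dispersion * mean**2.
    A dispersion of 0 falls back to a plain Poisson draw.
    """
    if dispersion == 0:
        return poisson_draw(mean, rng)
    # Gamma with shape 1/dispersion and scale mean*dispersion has
    # expectation `mean` and variance dispersion * mean**2.
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return poisson_draw(rate, rng)


rng = random.Random(0)
mean, dispersion, n_draws = 20.0, 0.1, 20_000  # hypothetical values

poisson_counts = [negative_binomial_draw(mean, 0.0, rng) for _ in range(n_draws)]
nb_counts = [negative_binomial_draw(mean, dispersion, rng) for _ in range(n_draws)]

poisson_var = statistics.variance(poisson_counts)
nb_var = statistics.variance(nb_counts)
expected_nb_var = mean + dispersion * mean**2  # 20 + 0.1 * 400 = 60

print(f"Poisson: mean={statistics.mean(poisson_counts):.2f}, variance={poisson_var:.2f}")
print(f"Negative binomial: mean={statistics.mean(nb_counts):.2f}, variance={nb_var:.2f}")
```

Under these assumptions the Poisson variance stays close to the mean, while the negative binomial variance is close to three times the mean. A Poisson-based sample size calculation ignores this extra variability between biological replicates, which is the gap the negative binomial approach is meant to address.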

