Abstract

Background: High-throughput sequencing is now regularly used for studies of the transcriptome (RNA-seq), particularly for comparisons among experimental conditions. For the time being, a limited number of biological replicates are typically considered in such experiments, leading to low detection power for differential expression. As their cost continues to decrease, it is likely that additional follow-up studies will be conducted to re-address the same biological question.

Results: We demonstrate how p-value combination techniques previously used for microarray meta-analyses can be used for the differential analysis of RNA-seq data from multiple related studies. These techniques are compared to a negative binomial generalized linear model (GLM) including a fixed study effect on simulated data and real data on human melanoma cell lines. The GLM with fixed study effect performed well for low inter-study variation and small numbers of studies, but was outperformed by the meta-analysis methods for moderate to large inter-study variability and larger numbers of studies.

Conclusions: The p-value combination techniques illustrated here are a valuable tool to perform differential meta-analyses of RNA-seq data by appropriately accounting for biological and technical variability within studies as well as additional study-specific effects. An R package metaRNASeq is available on the CRAN (http://cran.r-project.org/web/packages/metaRNASeq).

Highlights

  • High-throughput sequencing is regularly used for studies of the transcriptome (RNA-seq), particularly for comparisons among experimental conditions

  • Marot et al. [1] showed that the inverse normal p-value combination technique outperformed effect size combination methods and moderated t-tests [7] obtained from a linear model with a fixed study effect on several criteria, including sensitivity, area under the Receiver Operating Characteristic (ROC) curve, and gene ranking

  • The negative binomial generalized linear model (GLM) and p-value combination methods were applied to a pair of real RNA-seq studies performed to compare two human melanoma cell lines [23]

Summary

Introduction

High-throughput sequencing is regularly used for studies of the transcriptome (RNA-seq), particularly for comparisons among experimental conditions. A limited number of biological replicates are typically considered in such experiments, leading to low detection power for differential expression. As their cost continues to decrease, it is likely that additional follow-up studies will be conducted to re-address the same biological question. Several methods have been proposed to analyze microarray data arising from multiple independent but related studies; these meta-analysis techniques have the advantage of increasing the available sample size by integrating related datasets, subsequently increasing the power to detect differential expression. No other transformation has been proposed to obtain effect sizes for over-dispersed Poisson or negative binomial data.
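To make the idea of combining per-study p-values concrete, the following is a minimal Python sketch of two classical combination rules, Fisher's method and the inverse normal method, applied to the one-sided p-values of a single gene across independent studies. It is an illustration under stated assumptions, not the metaRNASeq implementation: the function names, the clamping constant, and the choice of weights (square root of each study's share of the total number of replicates) are assumptions made here for the example.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal, mean 0 and sd 1
_EPS = 1e-15                 # assumed clamp so log() and inv_cdf() stay finite


def _clamp(p):
    """Keep a p-value strictly inside (0, 1)."""
    return min(max(p, _EPS), 1.0 - _EPS)


def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p) follows a chi-square with 2k df
    under the null when the k p-values are independent and uniform."""
    k = len(pvalues)
    half_stat = -sum(math.log(_clamp(p)) for p in pvalues)  # statistic / 2
    # Closed-form chi-square survival function for even df = 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def inverse_normal_combine(pvalues, n_replicates):
    """Inverse normal method with weights whose squares sum to one.
    n_replicates[s] is the (hypothetical) replicate count of study s."""
    total_n = sum(n_replicates)
    weights = [math.sqrt(n / total_n) for n in n_replicates]
    stat = sum(
        w * _STD_NORMAL.inv_cdf(1.0 - _clamp(p))
        for w, p in zip(weights, pvalues)
    )
    # Under the null the weighted sum is standard normal.
    return 1.0 - _STD_NORMAL.cdf(stat)


if __name__ == "__main__":
    # Made-up p-values for one gene in two studies with 3 and 5 replicates.
    gene_pvalues = [0.04, 0.10]
    print(fisher_combine(gene_pvalues))
    print(inverse_normal_combine(gene_pvalues, [3, 5]))
```

In practice the combined p-values for all genes would then be adjusted for multiple testing; both rules assume that the per-study p-values are independent and uniformly distributed under the null hypothesis.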
