The Level of Residual Dispersion Variation and the Power of Differential Expression Tests for RNA-Seq Data

Gu Mi, Yanming Di

doi:10.1371/journal.pone.0120117

Abstract

RNA-Sequencing (RNA-Seq) has been widely adopted for quantifying gene expression changes in comparative transcriptome analysis. For detecting differentially expressed genes, a variety of statistical methods based on the negative binomial (NB) distribution have been proposed. These methods differ in the ways they handle the NB nuisance parameters (i.e., the dispersion parameters associated with each gene) to save power, such as by using a dispersion model to exploit an apparent relationship between the dispersion parameter and the NB mean. Presumably, dispersion models with fewer parameters will result in greater power if the models are correct, but will produce misleading conclusions if not. This paper investigates this power and robustness trade-off by assessing rates of identifying true differential expression using the various methods under realistic assumptions about NB dispersion parameters. Our results indicate that the relative performances of the different methods are closely related to the level of dispersion variation unexplained by the dispersion model. We propose a simple statistic to quantify the level of residual dispersion variation from a fitted dispersion model and show that the magnitude of this statistic gives hints about whether and how much we can gain statistical power by a dispersion-modeling approach.

Highlights

Over the last ten years, RNA-Sequencing (RNA-Seq) has become the technology of choice for quantifying gene expression changes in comparative transcriptome analysis [1].
We investigate the power and robustness of differential expression (DE) tests under realistic assumptions about the negative binomial (NB) dispersion parameters.
We model the residual variation in dispersion using a normal distribution (see Equation (2)), and the level of residual variation is summarized by a simple quantity, the normal variance σ².
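
The idea of residual dispersion variation can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes the normal noise acts on the log of the dispersion around a trend value (Equation (2) is not shown in this excerpt, so the log scale is an assumption), and it uses the common NB parameterization in which the variance equals μ + φμ². The function name `simulate_nb_counts` and its parameters are hypothetical.

```python
import numpy as np


def simulate_nb_counts(mu, trend_dispersion, sigma, n_samples, rng):
    """Simulate NB read counts with gene-level residual dispersion noise.

    Assumed model (illustrative, not taken from the paper):
        log(phi_i) = log(phi0_i) + eps_i,   eps_i ~ Normal(0, sigma^2)
    where phi0_i is the dispersion predicted by some trend model for gene i.
    """
    mu = np.asarray(mu, dtype=float)                 # per-gene NB means
    phi0 = np.asarray(trend_dispersion, dtype=float) # per-gene trend dispersions

    # Residual variation: one normal deviate per gene on the log scale.
    eps = rng.normal(0.0, sigma, size=mu.shape)
    phi = phi0 * np.exp(eps)

    # Convert (mean, dispersion) to NumPy's (n, p) parameterization so that
    # the mean is mu and the variance is mu + phi * mu**2.
    size = 1.0 / phi
    p = size / (size + mu)

    counts = rng.negative_binomial(
        size[:, None], p[:, None], size=(mu.size, n_samples)
    )
    return counts, phi
```

With `sigma = 0` every gene sits exactly on the trend, which is the setting where a dispersion model with few parameters is correct. Larger values of `sigma` spread the true dispersions around the trend, which is the situation the σ² statistic is meant to quantify.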

Summary

Introduction

Over the last ten years, RNA-Sequencing (RNA-Seq) has become the technology of choice for quantifying gene expression changes in comparative transcriptome analysis [1]. A typical RNA-Seq pipeline can be summarized as follows: purified RNA samples are converted to a library of cDNA with attached adaptors, and sequenced on an HTS platform to produce millions of short sequences from one or both ends of the cDNA fragments. These reads are aligned to either a reference genome or transcriptome (called sequence mapping), or assembled de novo without the genomic sequence. An NB regression model for describing the mean expression as a function of explanatory variables includes the following two components: 1.

Journal: PLOS ONE
Publication Date: Apr 7, 2015
Citations: 23
License type: CC BY 4.0
