Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.

Charlotte Soneson, Mark D Robinson, Michael I Love

doi:10.12688/f1000research.7563.2

Abstract

High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Various quantification approaches have been proposed, ranging from simple counting of reads that overlap given genomic regions to more complex estimation of underlying transcript abundances. In this paper, we show that gene-level abundance estimates and statistical inference offer advantages over transcript-level analyses, in terms of performance and interpretability. We also illustrate that the presence of differential isoform usage can lead to inflated false discovery rates in differential gene expression analyses on simple count matrices, but that this can be addressed by incorporating offsets derived from transcript-level abundance estimates. We also show that the problem is relatively minor in several real data sets. Finally, we provide an R package (tximport) to help users integrate transcript-level abundance estimates from common quantification pipelines into count-based statistical inference engines.
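To make the idea of gene-level summaries and offsets derived from transcript-level estimates more concrete, here is a minimal Python sketch. It sums transcript TPM values per gene and computes an abundance-weighted average transcript length per gene, which is the kind of quantity that can serve as a length offset. The function name `summarize_to_gene`, the dictionary layout and all numbers are hypothetical; this is an illustration under those assumptions, not the tximport implementation (tximport is an R package).

```python
from collections import defaultdict


def summarize_to_gene(tpm, eff_len, tx2gene):
    """Aggregate transcript-level estimates to gene level for one sample.

    tpm      : dict transcript -> TPM estimate (hypothetical input layout)
    eff_len  : dict transcript -> effective length
    tx2gene  : dict transcript -> gene identifier

    Returns (gene_tpm, gene_avg_len), where gene_avg_len is the
    abundance-weighted mean effective length of the gene's transcripts.
    """
    tpm_sum = defaultdict(float)
    weighted_len = defaultdict(float)
    for tx, abundance in tpm.items():
        gene = tx2gene[tx]
        tpm_sum[gene] += abundance
        weighted_len[gene] += abundance * eff_len[tx]
    # Genes with zero total abundance have no defined average length here.
    avg_len = {g: weighted_len[g] / tpm_sum[g] for g in tpm_sum if tpm_sum[g] > 0}
    return dict(tpm_sum), avg_len


# Toy data: one gene with a short and a long isoform (made-up values).
tx2gene = {"tx_short": "G1", "tx_long": "G1"}
eff_len = {"tx_short": 1000.0, "tx_long": 3000.0}

sample_a = {"tx_short": 8.0, "tx_long": 2.0}  # mostly the short isoform
sample_b = {"tx_short": 2.0, "tx_long": 8.0}  # mostly the long isoform

gene_tpm_a, avg_len_a = summarize_to_gene(sample_a, eff_len, tx2gene)
gene_tpm_b, avg_len_b = summarize_to_gene(sample_b, eff_len, tx2gene)

print(gene_tpm_a["G1"], avg_len_a["G1"])
print(gene_tpm_b["G1"], avg_len_b["G1"])
```

In this toy case the gene-level TPM is identical in both samples, while the average transcript length differs because the isoform mix differs. A per-sample, per-gene length of this kind is what a count-based model would need to account for.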

Highlights

Quantification and comparison of isoform- or gene-level expression based on high-throughput sequencing reads from cDNA (RNA-seq) are arguably among the most common tasks in modern computational molecular biology.
Accurate transcript-level estimation and inference play an important role in deriving appropriate gene-level results, and it is imperative to continue improving abundance estimation and inference methods applicable to individual transcripts, since misestimation can propagate to the gene level.
We have shown that when testing for changes in overall gene expression (DGE), traditional gene counting approaches may lead to an inflated false discovery rate compared to methods aggregating transcript-level TPM values or incorporating correction factors derived from these, for genes where the relative isoform usage differs between the compared conditions.
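The last highlight can be illustrated with a small worked example. The Python sketch below assumes an idealized setting in which the expected number of reads from a transcript is proportional to its abundance (TPM) times its length, with no noise. Under that assumption, a switch from a short to a long isoform changes the raw gene count even though the total gene abundance is unchanged, and dividing by the abundance-weighted average transcript length removes the artifact. All names and numbers are hypothetical.

```python
def expected_gene_count(tpm_by_tx, length_by_tx, scale=1.0):
    """Expected reads for a gene, assuming reads are proportional to TPM * length."""
    return scale * sum(tpm_by_tx[tx] * length_by_tx[tx] for tx in tpm_by_tx)


def average_length(tpm_by_tx, length_by_tx):
    """Abundance-weighted average transcript length of a gene in one condition."""
    total = sum(tpm_by_tx.values())
    return sum(tpm_by_tx[tx] * length_by_tx[tx] for tx in tpm_by_tx) / total


lengths = {"short": 1000, "long": 3000}

# Same total abundance (10 TPM) in both conditions, but the isoform mix flips.
cond1 = {"short": 9, "long": 1}
cond2 = {"short": 1, "long": 9}

count1 = expected_gene_count(cond1, lengths)
count2 = expected_gene_count(cond2, lengths)

# Ratio of raw gene counts: differs from 1 although gene abundance is equal.
raw_ratio = count2 / count1

# Ratio after dividing each count by that condition's average transcript length.
corrected_ratio = (count2 / average_length(cond2, lengths)) / (
    count1 / average_length(cond1, lengths)
)

print(raw_ratio, corrected_ratio)
```

Here the raw count ratio exceeds 2 while the length-corrected ratio is 1, which is the toy analogue of a gene being called differentially expressed by simple counting purely because of differential isoform usage.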

Summary

30 Dec 2015

This article is included in the RPackage and Bioconductor gateways. In this version, the Discussion section has been extended, mainly to include a discussion of differential expression analysis methods incorporating variance estimates.

The interest lies in comparing the transcriptional output between different conditions, and most RNA-seq studies can be classified as one of: 1) differential gene expression (DGE) studies, where the overall transcriptional output of each gene is compared between conditions; 2) differential transcript/exon usage (DTU/DEU) studies, where the composition of a gene's isoform abundance spectrum is compared between conditions; or 3) differential transcript expression (DTE) studies, where the interest lies in whether individual transcripts show differential expression between conditions. DTE analysis results can be represented on the individual transcript level, or aggregated to the gene level, e.g., by evaluating whether at least one of the isoforms shows evidence of differential abundance.
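The distinction between DGE, DTU and DTE can be made concrete with a toy calculation. The Python sketch below compares the three quantities the definitions refer to: the total gene abundance, the isoform proportions, and the individual transcript abundances. It deliberately ignores replicates, variability and statistical testing, so "changed" here only means "the noise-free numbers differ"; the helper names and values are hypothetical.

```python
def gene_total(tx_abundance):
    """Overall transcriptional output of the gene (what DGE compares)."""
    return sum(tx_abundance.values())


def isoform_proportions(tx_abundance):
    """Relative isoform usage within the gene (what DTU compares)."""
    total = gene_total(tx_abundance)
    return {tx: a / total for tx, a in tx_abundance.items()}


def describe(cond_a, cond_b):
    """Report which of DGE, DTU, DTE differ between two noise-free conditions."""
    return {
        "DGE": gene_total(cond_a) != gene_total(cond_b),
        "DTU": isoform_proportions(cond_a) != isoform_proportions(cond_b),
        # Gene-level aggregation of DTE: at least one transcript changes.
        "DTE": any(cond_a[tx] != cond_b[tx] for tx in cond_a),
    }


# Scenario 1: both isoforms double, so the composition is unchanged.
scaled = describe({"t1": 10, "t2": 10}, {"t1": 20, "t2": 20})

# Scenario 2: isoform switch with constant total output.
switched = describe({"t1": 15, "t2": 5}, {"t1": 5, "t2": 15})

print(scaled)
print(switched)
```

The first scenario shows DGE and DTE without DTU; the second shows DTU and DTE without DGE, which is the situation in which simple gene-level counting is most likely to mislead.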

Journal: F1000Research	Publication Date: Feb 29, 2016
Citations: 2798	License type: CC BY 4.0

Similar Papers

Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences
Michael I Love ... Rob Patro
F1000Research | VOL. 4
19 Feb 2016

Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences
Charlotte Soneson ... Michael I Love
F1000Research | VOL. 4
30 Dec 2015

Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments
Elie Maza ... Mohamed Zouine
Communicative & Integrative Biology | VOL. 6
09 Nov 2013

Longitudinal Effects of 1-Year Smoking Cessation on Human Bronchial Epithelial Transcriptome
Senani N.H Rathnayake ... Alen Faiz
CHEST | VOL. 164
28 Jan 2023
