The bench scientist's guide to statistical analysis of RNA-Seq data

Craig R Yendrek, Elizabeth A Ainsworth, Jyothi Thimmapuram

doi:10.1186/1756-0500-5-506

Open Access

https://doi.org/10.1186/1756-0500-5-506

Abstract

Background: RNA sequencing (RNA-Seq) is emerging as a highly accurate method to quantify transcript abundance. However, analyses of the large data sets obtained by sequencing the entire transcriptome of organisms have generally been performed by bioinformatics specialists. Here we provide a step-by-step guide and outline a strategy using currently available statistical tools that results in a conservative list of differentially expressed genes. We also discuss potential sources of error in RNA-Seq analysis that could alter interpretation of global changes in gene expression.

Findings: When comparing statistical tools, the negative binomial distribution-based methods, edgeR and DESeq, respectively identified 11,995 and 11,317 differentially expressed genes from an RNA-Seq dataset generated from soybean leaf tissue grown in elevated O3. However, the number of genes in common between these two methods was only 10,535, resulting in 2,242 genes determined to be differentially expressed by only one method. Upon analysis of the non-significant genes, several limitations of these analytic tools were revealed, including evidence for overly stringent parameters for determining statistical significance of differentially expressed genes as well as increased type II error for high abundance transcripts.

Conclusions: Because of the high variability between methods for determining differential expression of RNA-Seq data, we suggest using several bioinformatics tools, as outlined here, to ensure that a conservative list of differentially expressed genes is obtained. We also conclude that despite these analytical limitations, RNA-Seq provides highly accurate transcript abundance quantification that is comparable to qRT-PCR.
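The overlap arithmetic in the Findings can be checked with a short sketch. The code below is illustrative only and is not the authors' pipeline: the function names `compare_de_calls` and `discordant_count` and the toy gene IDs are hypothetical, and treating the intersection of the two tools' calls as the "conservative list" is an assumption about how the recommendation might be applied. The only values taken from the abstract are the three reported counts.

```python
def compare_de_calls(calls_a, calls_b):
    """Split two collections of differentially expressed gene IDs
    into shared calls and calls made by only one method."""
    set_a, set_b = set(calls_a), set(calls_b)
    return {
        "shared": set_a & set_b,   # called by both methods
        "only_a": set_a - set_b,   # called by method A alone
        "only_b": set_b - set_a,   # called by method B alone
    }


def discordant_count(n_a, n_b, n_shared):
    """Number of genes called by exactly one of two methods,
    given each method's total and the size of the overlap."""
    return (n_a - n_shared) + (n_b - n_shared)


# Counts reported in the abstract: edgeR, DESeq, and genes in common.
n_edger, n_deseq, n_common = 11_995, 11_317, 10_535
print(discordant_count(n_edger, n_deseq, n_common))  # 1,460 + 782 = 2242

# Toy example with made-up gene IDs (hypothetical, for illustration only).
edger_calls = ["geneA", "geneB", "geneC", "geneD"]
deseq_calls = ["geneB", "geneC", "geneE"]
result = compare_de_calls(edger_calls, deseq_calls)
print(sorted(result["shared"]))  # ['geneB', 'geneC']
```

Under this reading, 1,460 genes were called by edgeR alone and 782 by DESeq alone, which together give the 2,242 discordant genes stated in the Findings.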

Highlights

RNA sequencing (RNA-Seq) is emerging as a highly accurate method to quantify transcript abundance.
Because of the high variability between methods for determining differential expression of RNA-Seq data, we suggest using several bioinformatics tools, as outlined here, to ensure that a conservative list of differentially expressed genes is obtained.
We conclude that despite these analytical limitations, RNA-Seq provides highly accurate transcript abundance quantification that is comparable to qRT-PCR.

Journal: BMC Research Notes	Publication Date: Sep 14, 2012
Citations: 70	License type: CC BY 2.0
