Limitations of alignment-free tools in total RNA-seq quantification

Douglas C Wu, Kevin S Ho, Jun Yao, Alan M Lambowitz, Claus O Wilke

doi:10.1186/s12864-018-4869-5

Abstract

Background: Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. However, it is unclear whether these state-of-the-art RNA-seq analysis pipelines can quantify small RNAs as accurately as they do with long RNAs in the context of total RNA quantification.

Results: We comprehensively tested and compared four RNA-seq pipelines for accuracy of gene quantification and fold-change estimation. We used a novel total RNA benchmarking dataset in which small non-coding RNAs are highly represented along with other long RNAs. The four RNA-seq pipelines consisted of two commonly-used alignment-free pipelines and two variants of alignment-based pipelines. We found that all pipelines showed high accuracy for quantifying the expression of long and highly-abundant genes. However, alignment-free pipelines showed systematically poorer performance in quantifying lowly-abundant and small RNAs.

Conclusion: We have shown that alignment-free and traditional alignment-based quantification methods perform similarly for common gene targets, such as protein-coding genes. However, we have identified a potential pitfall in analyzing and quantifying lowly-expressed genes and small RNAs with alignment-free pipelines, especially when these small RNAs contain biological variations.

Highlights

Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis
The benchmarking dataset we used here consists of thermostable group II intron reverse transcriptase (TGIRT)-seq libraries for four well-defined samples from the microarray/sequencing quality control consortium (MAQC [18, 19]), each obtained in triplicate [15]
The MAQC samples A and B represent universal human reference total RNA and human brain reference total RNA, respectively, that are mixed with corresponding External RNA Controls Consortium (ERCC) spike-in transcripts

Summary

Introduction

Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. RNA-seq continues to pose great computational and statistical challenges. These challenges range from accurately aligning sequencing reads to accurate inference of gene expression levels [1, 2]. Read assignment is carried out by aligning sequencing reads to a reference genome, such that relative gene expression levels can be inferred by the alignments at annotated gene loci [2, 7]. These alignment-based methods are conceptually simple, but the read-alignment step can be time-consuming and computationally intensive despite recent advancements in fast read aligners [4, 8, 9]. A novel method has overcome this problem by using a thermostable group II intron reverse transcriptase (TGIRT) during RNA-seq library construction [15]. This method enables more comprehensive profiling of full-length structured small non-coding RNAs (sncRNA) along with long RNAs in a single RNA-seq library workflow [15, 16, 17]. It is possible to benchmark RNA-seq quantification tools on structured small non-coding RNAs.
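The alignment-based idea described above, inferring relative expression from alignments at annotated gene loci, can be illustrated with a minimal sketch. This is not the pipeline evaluated in the paper: the gene names, coordinates, the largest-overlap assignment rule, and the length normalization (transcripts per million, TPM) are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical annotation: gene name -> (chromosome, start, end), 0-based half-open.
GENES = {
    "geneA": ("chr1", 100, 1100),   # a long gene (1000 nt)
    "tRNA1": ("chr1", 2000, 2075),  # a small RNA (75 nt)
}

def assign_read(chrom, start, end, genes=GENES):
    """Return the gene whose interval overlaps the alignment the most, or None."""
    best, best_overlap = None, 0
    for name, (g_chrom, g_start, g_end) in genes.items():
        if g_chrom != chrom:
            continue
        overlap = min(end, g_end) - max(start, g_start)
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best

def count_reads(alignments, genes=GENES):
    """Count alignments (chrom, start, end) per annotated gene; drop unassigned reads."""
    counts = Counter()
    for chrom, start, end in alignments:
        gene = assign_read(chrom, start, end, genes)
        if gene is not None:
            counts[gene] += 1
    return counts

def tpm(counts, genes=GENES):
    """Length-normalize counts and scale so the values sum to one million."""
    rates = {g: counts.get(g, 0) / (end - start) for g, (_, start, end) in genes.items()}
    total = sum(rates.values())
    return {g: (r / total * 1e6 if total else 0.0) for g, r in rates.items()}

if __name__ == "__main__":
    # Toy alignments: 10 reads on geneA, 3 on tRNA1, 1 outside any annotation.
    alignments = [("chr1", 150, 250)] * 10 + [("chr1", 2000, 2075)] * 3 + [("chr2", 0, 100)]
    counts = count_reads(alignments)
    print(dict(counts))
    print(tpm(counts))
```

In this toy case the small RNA receives fewer reads than the long gene but a higher length-normalized abundance, which is one reason short genes are sensitive to even a few misassigned reads.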

Journal: BMC Genomics
Publication Date: Jul 3, 2018
Citations: 69
License type: open-access
