Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.

Li Song, Liliana Florea

doi:10.1186/s13742-015-0089-y


Open Access


Journal: GigaScience	Publication Date: Oct 19, 2015
Citations: 445	License type: cc-by

Affiliation: Johns Hopkins University, Johns Hopkins Medicine

Abstract

Background: Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing.

Findings: We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read.

Conclusions: Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.

Electronic supplementary material: The online version of this article (doi:10.1186/s13742-015-0089-y) contains supplementary material, which is available to authorized users.
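To make the idea of a De Bruijn graph over trusted k-mers concrete, here is a minimal Python sketch. It is not Rcorrector's implementation: the function names, the toy reads, and the fixed `MIN_COUNT` cutoff are assumptions chosen only to produce a small trusted set. The graph is kept implicit, so an edge exists whenever shifting a k-mer by one base yields another trusted k-mer.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def successors(kmer, trusted):
    """Trusted k-mers reachable from `kmer` by one De Bruijn edge
    (drop the first base, append one of A, C, G, T)."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in trusted]

# Toy input: three identical reads plus one read with a single substitution.
reads = ["ACGTACGA", "ACGTACGA", "ACGTACGA", "ACGTTCGA"]
K = 4
counts = count_kmers(reads, K)

# Assumption: a fixed cutoff, used here only to build a small trusted set.
# Rcorrector itself does not rely on one global cutoff.
MIN_COUNT = 2
trusted = {kmer for kmer, c in counts.items() if c >= MIN_COUNT}

print(sorted(trusted))
print(successors("TACG", trusted))
```

Storing only the set of trusted k-mers and deriving edges on demand is one common way to keep such a graph compact; whether Rcorrector stores its graph this way is not stated in the text above.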

Highlights

Next-generation sequencing of cellular RNA (RNA-seq) has become the foundation of virtually every transcriptomic analysis
With a 5 GB memory footprint for 100 million reads, Rcorrector can be run on virtually any desktop or server
The large number of reads generated from a single sample allows researchers to study the genes being expressed and estimate their expression levels, and to discover alternative splicing and other sequence variations

Summary

Introduction

Next-generation sequencing of cellular RNA (RNA-seq) has become the foundation of virtually every transcriptomic analysis. While read coverage in WGS data is largely uniform across the genome, genes and transcripts in an RNA-seq experiment have different expression levels.

Read error correction: the path search algorithm

As with any k-spectrum method, Rcorrector distinguishes between solid and non-solid k-mers as the basis for its correction algorithm.
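The following Python sketch illustrates why uneven expression levels defeat a single global cutoff for solid k-mers, and how a per-read threshold can help. It is an illustration under stated assumptions, not Rcorrector's algorithm: the rule used here (a k-mer is solid if its count reaches a fixed fraction of the highest count among the k-mers of the same read), the `fraction` parameter, and the toy transcripts are all hypothetical.

```python
from collections import Counter

def kmers(seq, k):
    """All k-mers of a sequence, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def classify_read(read, counts, k, fraction=0.2):
    """Mark each k-mer position of a read as solid (True) or non-solid.

    HYPOTHETICAL rule, for illustration only: the threshold is a fixed
    fraction of the highest k-mer count observed within this read.
    """
    read_counts = [counts[km] for km in kmers(read, k)]
    local_threshold = fraction * max(read_counts)
    return [c >= local_threshold for c in read_counts]

K = 5
HIGH = "ACGTACGATTGC"       # toy highly expressed transcript
HIGH_ERR = "ACGTACGATTGA"   # same transcript with an error in the last base
LOW = "GGCATTCAGGAT"        # toy lowly expressed transcript

reads = [HIGH] * 50 + [HIGH_ERR] * 5 + [LOW] * 3
counts = Counter(km for read in reads for km in kmers(read, K))

# A global cutoff of 4 trusts the erroneous k-mer (count 5) and
# rejects a genuine k-mer of the lowly expressed transcript (count 3).
GLOBAL_CUTOFF = 4
print(counts["ATTGA"] >= GLOBAL_CUTOFF, counts["GGCAT"] >= GLOBAL_CUTOFF)

# The per-read threshold flags only the erroneous k-mer and keeps
# every k-mer of the lowly expressed transcript.
print(classify_read(HIGH_ERR, counts, K))
print(classify_read(LOW, counts, K))
```

In this toy data no global cutoff separates the two cases, because the erroneous k-mer from the highly expressed transcript is more frequent than the genuine k-mers of the lowly expressed one.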

