Assessment of the impact of using a reference transcriptome in mapping short RNA-Seq reads.

Shanrong Zhao

doi:10.1371/journal.pone.0101374

Abstract

RNA-Seq has become increasingly popular in transcriptome profiling. The major challenge in RNA-Seq data analysis is the accurate mapping of junction reads to their genomic origins. To detect splice sites in short reads, many RNA-Seq aligners use a reference transcriptome to inform the placement of junction reads. However, no systematic evaluation has been performed to assess or quantify the benefits of incorporating a reference transcriptome in mapping RNA-Seq reads. In this paper, we study the impact of a reference transcriptome on mapping RNA-Seq reads, especially junction reads. The same dataset was analysed with and without the RefGene transcriptome, and a Perl script was developed to analyse and compare the mapping results. About 50–55% of junction reads can be mapped to the same genomic regions regardless of whether the RefGene model is used. More than one-third of reads fail to be mapped without the help of a reference transcriptome. For “Alternatively” mapped reads, i.e., reads mapped differently with and without the RefGene model, the mappings without the RefGene model are usually worse than the corresponding alignments with it. Junction reads that span more than two exons are less likely to be aligned correctly without the assistance of a reference transcriptome. As sequencing technology evolves, read lengths keep increasing. Longer reads are more likely to span multiple exons, so mapping long junction reads without a reference transcriptome is becoming more challenging. The advantages of using a reference transcriptome demonstrated in this study are therefore becoming more evident for longer reads. In addition, the effect of the completeness of the reference transcriptome on the mapping of RNA-Seq reads is discussed.
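The comparison described in the abstract can be illustrated with a small Python sketch. This is not the author's Perl script; it only shows the general idea under simplifying assumptions. Each alignment is reduced to a hypothetical `(chromosome, start, CIGAR)` tuple, the category names are invented for illustration, and the number of junctions in a read is taken to be the number of `N` (skipped region) operations in its CIGAR string, which is how spliced alignments are conventionally encoded in SAM.

```python
import re
from collections import Counter

# Hypothetical, simplified alignment record: (chromosome, 1-based start, CIGAR).
# None means the read was not aligned in that run.

def junction_count(cigar):
    """Count splice junctions as the number of 'N' operations in a CIGAR string."""
    return sum(1 for _, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar) if op == "N")

def classify(with_model, without_model):
    """Compare one read's alignment from the two runs (category names are illustrative)."""
    if with_model is None and without_model is None:
        return "unmapped_in_both"
    if without_model is None:
        return "mapped_only_with_model"
    if with_model is None:
        return "mapped_only_without_model"
    if with_model == without_model:
        return "identical"
    return "alternative"

def compare_runs(run_with, run_without):
    """Tally categories over all read IDs seen in either run."""
    counts = Counter()
    for read_id in run_with.keys() | run_without.keys():
        counts[classify(run_with.get(read_id), run_without.get(read_id))] += 1
    return counts

# Toy data, invented for this example only.
run_with = {
    "r1": ("chr1", 100, "50M1000N50M"),
    "r2": ("chr1", 500, "30M200N40M300N30M"),  # spans three exons (two junctions)
    "r3": ("chr2", 10, "40M500N60M"),
}
run_without = {
    "r1": ("chr1", 100, "50M1000N50M"),  # same placement in both runs
    "r2": None,                          # not mapped without the gene model
    "r3": ("chr2", 10, "40M60S"),        # soft-clipped instead of spliced
}

counts = compare_runs(run_with, run_without)
print(dict(counts))
print(junction_count(run_with["r2"][2]))
```

In this toy input, one read falls into each of the three categories the abstract discusses: identically mapped, mapped only with the gene model, and alternatively mapped. A real comparison would read the alignments from SAM/BAM files and would also need to handle multi-mapped and paired-end reads.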

Highlights

In recent years, RNA-Seq has become a popular and powerful approach for transcriptome profiling [1,2,3,4,5,6].
Short reads generated by RNA-Seq experiments must be aligned, or “mapped”, to a reference genome or transcriptome assembly.
The number of reads aligned to each feature approximates the abundance of that feature in the original sample. Such measures of digital gene expression can then be compared among samples or treatments in a statistical framework.
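As a rough illustration of how read counts per feature arise, the following Python sketch counts aligned reads that overlap each feature. The feature names, coordinates, and reads are made up, and the overlap rule (any shared base counts) is an assumption; real quantification tools additionally deal with strand, spliced reads, and reads overlapping several features.

```python
# Hypothetical features: name -> (chromosome, start, end), 1-based inclusive.
features = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
}

# Hypothetical aligned reads: (chromosome, start, end), 1-based inclusive.
reads = [
    ("chr1", 120, 170),
    ("chr1", 480, 530),   # partially overlaps geneA
    ("chr1", 900, 950),
    ("chr2", 100, 150),   # different chromosome, counted for no feature
]

def count_reads(features, reads):
    """Count, for each feature, the reads sharing at least one base with it."""
    counts = {name: 0 for name in features}
    for chrom, start, end in reads:
        for name, (f_chrom, f_start, f_end) in features.items():
            if chrom == f_chrom and start <= f_end and end >= f_start:
                counts[name] += 1
    return counts

feature_counts = count_reads(features, reads)
print(feature_counts)
```

The resulting per-feature counts are the kind of raw values that are then normalised and compared between samples.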

Summary

Introduction

RNA-Seq has become a popular and powerful approach for transcriptome profiling [1,2,3,4,5,6]. RNA-Seq has considerable advantages for examining transcriptome fine structure (for example, in the detection of novel transcripts, allele-specific expression, and alternative splicing) and provides a far more precise measurement of transcript levels than other methods such as microarrays [7,8,9,10]. RNA-Seq has a much broader dynamic range than microarrays, which allows for the detection of more differentially expressed genes with higher fold change. RNA-Seq also avoids technical issues in microarrays related to probe performance, such as cross-hybridization, the limited detection range of individual probes, and nonspecific hybridization. RNA-Seq is becoming an attractive approach for profiling gene expression and evaluating differential expression [11].

Journal: PLoS ONE
Publication Date: Jul 3, 2014
Citations: 22
License type: CC BY 4.0

Similar Papers

Impact of Gene Annotation on RNA-seq Data Analysis
Shanrong Zhao ... Baohong Zhang
14 Jan 2016

G-SNPM - A GPU-based SNP mapping tool
Alessandro Orro ... Andrea Manconi
EMBnet.journal | VOL. 18
09 Nov 2012

A comprehensive evaluation of ensembl, RefSeq, and UCSC annotations in the context of RNA-seq read mapping and gene quantification.
Shanrong Zhao ... Baohong Zhang
BMC Genomics | VOL. 16
18 Feb 2015

Evaluation of tools for long read RNA-seq splice-aware alignment.
Krešimir Križanović ... Amina Echchiki
Bioinformatics | VOL. 34
23 Oct 2017
