PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms.

Ruei-Chi Gan, Po-Jung Huang, Chi-Ching Lee, Timothy H Wu, Cheng-Hsun Chiu, Petrus Tang, Yuan-Ming Yeh, Ting-Wen Chen, Hsien-Da Huang

doi:10.1186/s12859-016-1366-1

Abstract

Background: Next-generation sequencing promises de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genome sequences, and even fewer have well-defined or curated annotations. For transcriptome studies of organisms that lack a proper reference genome, the common strategy is de novo assembly followed by functional annotation. The analysis becomes more complicated still when multiple transcriptomes are compared.

Results: Here, we propose a new analysis strategy and quantification methods for expression levels that not only generate a virtual reference from sequencing data but also support comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled for de novo assembly. The assembled contigs are searched against the NCBI NR database to find potential homologous sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. Because the same reference is used, the normalized quantification values, including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM), are comparable across transcriptome datasets. To demonstrate the feasibility of our strategy, we implemented it in the web service PARRoT (Pipeline for Analyzing RNA Reads of Transcriptomes). PARRoT analyzes gene expression profiles for two transcriptome sequencing datasets. To aid biological interpretation of the comparison, PARRoT further links the virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within three hours.

Conclusions: In this study, we proposed and implemented a strategy for analyzing transcriptomes from non-reference organisms, which makes it possible to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on comparative genomics for non-model organisms. We have implemented PARRoT as a web service, which is freely available at http://parrot.cgu.edu.tw.
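
The clustering step described in the Results can be pictured with a minimal sketch, assuming each assembled contig has already been assigned a single best NR hit. How PARRoT actually selects hits and builds virtual transcripts is not specified here; the function name, the input format, and the identifier strings below are hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_contigs_by_best_hit(
    best_hits: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Group contig IDs that share the same best-hit homolog.

    Each group stands in for one virtual transcript. This is an assumed
    simplification, not PARRoT's documented procedure.
    """
    virtual_transcripts: Dict[str, List[str]] = defaultdict(list)
    for contig_id, homolog_id in best_hits:
        virtual_transcripts[homolog_id].append(contig_id)
    return dict(virtual_transcripts)


# Hypothetical (contig, best-hit) pairs from a pooled assembly.
hits = [
    ("contig_1", "homolog_A"),
    ("contig_2", "homolog_B"),
    ("contig_3", "homolog_A"),
]
virtual = group_contigs_by_best_hit(hits)
```

Because the contigs come from a pooled assembly, reads from each dataset can be counted against the same groups, which is what makes the resulting values comparable across datasets.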

Highlights

Next-generation sequencing promises de novo genomic and transcriptomic analysis of samples of interest
Given a total of m virtual transcripts, the estimated RPKM (eRPKM) of each transcript x is derived from Equation 2, in which k_x is the number of contigs belonging to virtual transcript x, n_{x,i} is the number of reads mapped to the ith contig belonging to transcript x, and l_{x,i} is the length of the ith contig belonging to transcript x
To address the lack of a proper reference for transcriptome analysis of non-model organisms, we propose an analysis strategy that includes pooled assembly, clustering of contigs into virtual transcripts, and several quantification methods
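
The quantities named in the highlights above can be illustrated with a small sketch. Equation 2 itself is not reproduced on this page, so the code assumes the conventional RPKM definition applied to the pooled contigs of a virtual transcript (total mapped reads over total contig length, scaled by 10^9 over the library size N) and the conventional rescaling from RPKM to TPM. The function names, the parameter `total_mapped_reads`, and the sample numbers are hypothetical and are not taken from PARRoT.

```python
from typing import Dict, List, Tuple

# Each virtual transcript x maps to a list of (reads_mapped, contig_length)
# pairs, i.e. (n_{x,i}, l_{x,i}) for i = 1..k_x.
Contigs = List[Tuple[int, int]]


def erpkm(contigs: Contigs, total_mapped_reads: int) -> float:
    """Assumed form: 1e9 * sum_i n_{x,i} / (N * sum_i l_{x,i})."""
    reads = sum(n for n, _ in contigs)
    length = sum(l for _, l in contigs)
    if length == 0 or total_mapped_reads == 0:
        return 0.0
    return 1e9 * reads / (total_mapped_reads * length)


def etpm(values: Dict[str, float]) -> Dict[str, float]:
    """Rescale eRPKM values so they sum to one million (assumed TPM step)."""
    total = sum(values.values())
    if total == 0:
        return {x: 0.0 for x in values}
    return {x: 1e6 * v / total for x, v in values.items()}


# Hypothetical sample with m = 2 virtual transcripts.
sample = {
    "vt1": [(300, 1000), (100, 1000)],  # k_x = 2 contigs
    "vt2": [(200, 500)],                # k_x = 1 contig
}
# Library size N: total reads mapped in this sample.
N = sum(n for contigs in sample.values() for n, _ in contigs)
rpkm = {x: erpkm(c, N) for x, c in sample.items()}
tpm = etpm(rpkm)
```

Summing reads and lengths over all contigs of a virtual transcript treats the contigs as fragments of one transcript, which is the reading of k_x, n_{x,i} and l_{x,i} assumed here.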

Summary

Introduction

Next-generation sequencing promises de novo genomic and transcriptomic analysis of samples of interest. Only a few organisms have reference genome sequences, and even fewer have well-defined or curated annotations. For transcriptome studies of organisms that lack a proper reference genome, the common strategy is de novo assembly followed by functional annotation. With the coming of age of high-throughput sequencing technologies, RNA-Seq has become a revolutionary tool for transcriptomic analysis [1]. For organisms with reference genomes, a typical RNA-Seq data analysis starts by mapping the short reads to the genomic or annotated mRNA sequences [2,3,4]. The mapping results between reads and transcripts can then be used to quantify the transcriptome and reveal expression profiles. Comparing the transcript profiles of an organism can reveal differences in the molecular constituents of cells from different tissues, developmental stages, physiological conditions, or treatments.


Journal: BMC Bioinformatics	Publication Date: Dec 1, 2016
Citations: 9	License type: cc-by

Similar Papers

Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data
Oana A Tomescu ... Gerhard G Thallinger
BMC Systems Biology | VOL. 8 | 01 Mar 2014

FastAnnotator- an efficient transcript annotation web tool
Ting-Wen Chen ... Ruei-Chi Richie Gan
BMC Genomics | VOL. 13 | 01 Dec 2012

A long-read and short-read transcriptomics approach provides the first high-quality reference transcriptome and genome annotation for Pseudotsuga menziesii (Douglas-fir).
Vera Marjorie Elauria Velasco ... J Holland
G3 (Bethesda, Md.) | VOL. 13 | 01 Dec 2022

Annotation of nerve cord transcriptome in earthworm Eisenia fetida
Vasanthakumar Ponesakki ... Sudhakar Sivasubramaniam
Genomics Data | VOL. 14 | 12 Oct 2017
