Abstract

De novo RNA-Seq assembly facilitates the study of transcriptomes for species without sequenced genomes, but it is challenging to select the most accurate assembly in this context. To address this challenge, we developed a model-based score, RSEM-EVAL, for evaluating assemblies when the ground truth is unknown. We show that RSEM-EVAL correctly reflects assembly accuracy, as measured by REF-EVAL, a refined set of ground-truth-based scores that we also developed. Guided by RSEM-EVAL, we assembled the transcriptome of the regenerating axolotl limb; this assembly compares favorably to a previous assembly. A software package implementing our methods, DETONATE, is freely available at http://deweylab.biostat.wisc.edu/detonate.

Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0553-5) contains supplementary material, which is available to authorized users.

Highlights

  • RNA sequencing (RNA-Seq) technology is revolutionizing the study of species that have not yet had their genomes sequenced by enabling the large-scale analysis of their transcriptomes

  • We improve upon the state-of-the-art in transcriptome assembly evaluation by presenting the DETONATE methodology (DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation) and software package

  • DETONATE consists of two components: RSEM-EVAL, which does not require a ground truth reference, and REF-EVAL, which does

Summary

Introduction

RNA sequencing (RNA-Seq) technology is revolutionizing the study of species that have not yet had their genomes sequenced by enabling the large-scale analysis of their transcriptomes. Evaluation measures used in such studies can be reference-based or reference-free. In most cases where de novo assembly is of interest, reference sequences are either not available, incomplete, or considerably diverged from the ground truth of the sample of interest, which makes the assembly evaluation task markedly more difficult. In such cases, one must resort to reference-free measures. The motivation for this measure is that better assemblies will result from a larger number of identified overlaps between the input
