Visualization and analysis of RNA-Seq assembly graphs.

Fahmi W Nazarie, Sz-Hau Chen, Geoffrey J Faulkner, Kim M Summers, Tim Angus, Barbara Shih, Mick Watson, Harpreet K Saini, Anton J Enright, Stijn Van Dongen, Karsten Klein, Mark W Barnett, Tom C Freeman

doi:10.1093/nar/gkz599

Abstract

RNA-Seq is a powerful transcriptome profiling technology enabling transcript discovery and quantification. Whilst most commonly used for gene-level quantification, the data can be used for the analysis of transcript isoforms. However, when the underlying transcript assemblies are complex, current visualization approaches can be limiting, with splicing events a challenge to interpret. Here, we report on the development of a graph-based visualization method as a complementary approach to understanding transcript diversity from short-read RNA-Seq data. Following the mapping of reads to a reference genome, a read-to-read comparison is performed on all reads mapping to a given gene, producing a weighted similarity matrix between reads. This is used to produce an RNA assembly graph, where nodes represent reads and edges similarity scores between them. The resulting graphs are visualized in 3D space to better appreciate their sometimes large and complex topology, with other information being overlaid on to nodes, e.g. transcript models. Here we demonstrate the utility of this approach, including the unusual structure of these graphs and how they can be used to identify issues in assembly, repetitive sequences within transcripts and splice variants. We believe this approach has the potential to significantly improve our understanding of transcript complexity.
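The graph construction described in the abstract (reads as nodes, similarity scores as edge weights) can be illustrated with a minimal sketch. The scoring used here is an assumption made purely for illustration: a Jaccard index over shared k-mers stands in for the read-to-read comparison, and the names `kmers`, `jaccard`, `build_read_graph`, the parameters `k` and `min_score`, and the toy reads are all hypothetical rather than taken from the paper.

```python
from itertools import combinations


def kmers(read, k=4):
    """Return the set of all k-length substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_read_graph(reads, k=4, min_score=0.3):
    """Build a weighted, undirected graph as a dict of dicts.

    Nodes are read IDs; an edge is kept when the similarity score
    between two reads is at least min_score (an assumed threshold).
    """
    profiles = {rid: kmers(seq, k) for rid, seq in reads.items()}
    graph = {rid: {} for rid in reads}
    for u, v in combinations(reads, 2):
        score = jaccard(profiles[u], profiles[v])
        if score >= min_score:
            graph[u][v] = score
            graph[v][u] = score
    return graph


# Toy reads (invented): r1-r3 tile one stretch of sequence, r4 is unrelated.
reads = {
    "r1": "ACGTTGCAAGCT",
    "r2": "TTGCAAGCTAGG",
    "r3": "CAAGCTAGGTCA",
    "r4": "GGGGGGGGGGGG",
}
graph = build_read_graph(reads)
for read_id, neighbours in graph.items():
    print(read_id, neighbours)
```

In this toy case, reads that overlap along the same stretch of sequence link into a chain while the unrelated read stays isolated. The all-pairs loop is quadratic in the number of reads, so it is only suitable for small illustrative inputs.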

Highlights

The advent of next-generation sequencing platforms enables new approaches to solving a variety of problems in medicine, agriculture, evolution and the environment.
In order to explain the observed anomalies in the graph for this gene, we investigated the genomic origin of reads mapping to loop junctions using BLAST.
Many tools and analysis pipelines already exist to process these data from the DNA sequencer, through mapping to a genome or de novo assembly, and summarize these data down to read counts per gene/transcript. These data are ready for differential gene expression or cluster-based analyses.

Summary

Introduction

The advent of next-generation sequencing platforms enables new approaches to solving a variety of problems in medicine, agriculture, evolution and the environment. RNA-sequencing (RNA-Seq) based transcriptome analyses are used routinely as an alternative to microarrays for measuring transcript abundance, as well as offering the potential for gene and non-coding transcript discovery, splice variant and genome variance analyses [1,2]. Data are typically summarized by counting the number of sequencing reads that map to genomic features of interest, e.g. genes. These measures are used as the basis for determining the level of expression in a given sample and differential expression between samples. A large number of pipelines for the analysis of RNA-Seq data have been developed to go from the output of a sequencing machine, to sequence assembly, and on to the quantification of gene expression [3,4,5]. Many aspects of the analysis of these data remain computationally expensive and limiting, and tools are still under active development [6,7,8,9].
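The gene-level summarization mentioned above, counting reads that map to a feature of interest, can be sketched as a simple interval-overlap count. This is an illustrative sketch only: the function name `count_reads_per_gene`, the half-open coordinate convention and the toy gene and alignment coordinates are assumptions, and the sketch ignores strand, multi-mapping reads and overlapping features, which dedicated counting tools have to handle.

```python
def count_reads_per_gene(genes, alignments):
    """Count aligned reads that overlap each gene interval.

    genes:      dict of gene name -> (chrom, start, end), half-open coordinates
    alignments: iterable of (chrom, start, end) for each mapped read
    """
    counts = {name: 0 for name in genes}
    for chrom, start, end in alignments:
        for name, (g_chrom, g_start, g_end) in genes.items():
            # Two half-open intervals overlap when each starts before the other ends.
            if chrom == g_chrom and start < g_end and end > g_start:
                counts[name] += 1
    return counts


# Hypothetical gene models and read alignments, invented for illustration.
genes = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
}
alignments = [
    ("chr1", 90, 140),     # overlaps the start of geneA
    ("chr1", 450, 500),    # inside geneA
    ("chr1", 500, 550),    # starts exactly at geneA's end, so no overlap
    ("chr1", 1000, 1050),  # inside geneB
    ("chr2", 100, 150),    # different chromosome
]
counts = count_reads_per_gene(genes, alignments)
print(counts)
```

Counts of this kind are the per-sample values on which expression level and differential expression comparisons are based.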

Journal: Nucleic Acids Research
Publication Date: Jul 15, 2019
Citations: 4
License type: CC BY 4.0
