Exact transcript quantification over splice graphs

Cong Ma, Carl Kingsford, Hongyu Zheng

doi:10.1186/s13015-021-00184-7


Open Access

https://doi.org/10.1186/s13015-021-00184-7


Abstract

Background: The probability of sequencing a set of RNA-seq reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts. We call this model graph quantification, which was first proposed by Bernard et al. (Bioinformatics 30:2447–55, 2014). The model can be viewed as a generalization of transcript expression quantification where every full path in the splice graph is a possible transcript. However, the previous graph quantification model assumes the length of single-end reads or paired-end fragments is fixed.

Results: We provide an improvement of this model to handle variable-length reads or fragments and incorporate bias correction. We prove that our model is equivalent to running a transcript quantifier with exactly the set of all compatible transcripts. The key to our method is constructing an extension of the splice graph based on Aho-Corasick automata. The proof of equivalence is based on a novel reparameterization of the read generation model of a state-of-the-art transcript quantification method.

Conclusion: We propose a new approach for graph quantification, which is useful for modeling scenarios where the reference transcriptome is incomplete or not available and can be further used in transcriptome assembly or alternative splicing analysis.
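To make the phrase "every full path in the splice graph is a possible transcript" concrete, the following minimal sketch enumerates all source-to-sink paths of a toy splice graph. The graph, the node names (`S`, `T`, `e1` to `e3`) and the helper `full_paths` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def full_paths(graph: Dict[str, List[str]], source: str, sink: str) -> List[List[str]]:
    """Return every source-to-sink path of a DAG given as an adjacency list."""
    paths = []
    stack = [(source, [source])]  # (current node, path walked so far)
    while stack:
        node, path = stack.pop()
        if node == sink:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return sorted(paths)


# Hypothetical toy gene: exon e2 may be skipped, giving two candidate transcripts.
toy_graph = {
    "S": ["e1"],
    "e1": ["e2", "e3"],
    "e2": ["e3"],
    "e3": ["T"],
}
candidates = full_paths(toy_graph, "S", "T")
for path in candidates:
    print(" -> ".join(path))
```

The number of full paths can grow exponentially with the number of alternative splicing events, which is one reason to work with junction-level quantities on the graph rather than an explicit transcript list.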

Highlights

The probability of sequencing a set of RNA sequencing (RNA-seq) reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts.
We prove that, when every full path of the splice graph is supplied as a reference transcript, optimizing a network flow on the prefix graph is equivalent to optimizing reference-transcript abundances under the state-of-the-art transcript expression quantification formulation. The proof assumes that the modeled biases of generating a fragment are determined by the fragment sequence itself, regardless of which transcript it comes from.
The key algorithmic contributions are a provably correct reparameterization process and the introduction of the prefix graph, inspired by Aho-Corasick automata, for inference.
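For readers unfamiliar with the Aho-Corasick automaton that the prefix graph is inspired by, here is a minimal textbook sketch of the generic automaton (trie transitions plus failure links). It is not the paper's prefix graph construction; the function names `build_automaton` and `find_all` and the example strings are assumptions made for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto/fail/output tables of a textbook Aho-Corasick automaton."""
    goto = [{}]    # goto[state][symbol] -> next state (trie edges)
    fail = [0]     # fail[state] -> state of the longest proper suffix in the trie
    out = [set()]  # out[state] -> patterns ending at this state
    for pat in patterns:
        state = 0
        for sym in pat:
            if sym not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][sym] = len(goto) - 1
            state = goto[state][sym]
        out[state].add(pat)
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:
        state = queue.popleft()
        for sym, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and sym not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(sym, 0)
            out[nxt] |= out[fail[nxt]]  # inherit matches that end at the suffix
    return goto, fail, out


def find_all(text, patterns):
    """Return sorted (start index, pattern) pairs for all matches in text."""
    goto, fail, out = build_automaton(patterns)
    state, hits = 0, []
    for i, sym in enumerate(text):
        while state and sym not in goto[state]:
            state = fail[state]
        state = goto[state].get(sym, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return sorted(hits)


hits = find_all("ushers", ["he", "she", "his", "hers"])
```

Each automaton state remembers the longest suffix of the input read so far that is also a prefix of some pattern. That is the general property of the automaton; how the paper adapts it to paths in a splice graph is not shown here.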

Summary

Introduction

The probability of sequencing a set of RNA-seq reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts. We call this model graph quantification; it was first proposed by Bernard et al. (Bioinformatics 30:2447–55, 2014). Their method, FlipFlop, infers network flow on its own extension of splice graphs, called fragment graphs, and uses the model to further assemble transcripts. However, the fragment graph model retains its theoretical guarantee only when the lengths of single-end reads or paired-end fragments are fixed. Our method is instead based on flow inference on a different extension of the splice graph.
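The link between a network flow and transcript abundances can be illustrated with classical flow decomposition: a flow on a directed acyclic graph that conserves flow at every internal node can be split into weighted source-to-sink paths. The sketch below is a generic greedy decomposition on a hypothetical toy graph, not the inference procedure of FlipFlop or of this paper; `decompose_flow` and the edge values are assumptions, and in general such a decomposition is not unique.

```python
def decompose_flow(flow, source, sink):
    """Greedily split a conserved DAG flow into weighted source-to-sink paths."""
    residual = dict(flow)  # (u, v) -> flow not yet assigned to a path
    paths = []
    while True:
        path, node = [source], source
        while node != sink:
            # Follow any outgoing edge that still carries flow.
            nxt = next(
                (v for (u, v), f in residual.items() if u == node and f > 1e-12),
                None,
            )
            if nxt is None:
                break
            path.append(nxt)
            node = nxt
        if node != sink:
            break  # no remaining flow leaves the source
        edges = list(zip(path, path[1:]))
        bottleneck = min(residual[e] for e in edges)
        for e in edges:
            residual[e] -= bottleneck
        paths.append((path, bottleneck))
    return paths


# Hypothetical junction-level flow on a toy splice graph (exon e2 may be skipped).
toy_flow = {
    ("S", "e1"): 10,
    ("e1", "e2"): 4,
    ("e1", "e3"): 6,
    ("e2", "e3"): 4,
    ("e3", "T"): 10,
}
weighted_paths = decompose_flow(toy_flow, "S", "T")
```

Here the flow of 10 units splits into one path that includes `e2` with weight 4 and one that skips it with weight 6, so edge-level quantities are enough to describe the path-level abundances in this small example.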



Journal: Algorithms for Molecular Biology
Publication Date: May 10, 2021
Citations: 4
License type: open-access


