Abstract

<h3>Motivation</h3>
Established high-throughput RNA-seq technologies usually produce paired-end reads. A challenging problem is therefore to computationally infer the alignment of the entire fragment given the alignments of its two mate ends. Solving this problem essentially provides longer RNA-seq reads and hence benefits downstream RNA-seq analysis.

<h3>Results</h3>
We introduce Coral, a new tool that accurately bridges paired-end RNA-seq reads. The core of Coral is a novel optimization formulation that captures the most reliable bridging path while filtering out false paths. An efficient dynamic programming algorithm is designed to calculate the top <i>N</i> optimal solutions. Coral then uses a consensus approach to select the best solution among the <i>N</i> candidates, taking into account the fragment-length distribution. Coral is modular and can be easily incorporated into existing RNA-seq analysis pipelines. We show that Coral improves transcript assembly by a large margin: on average over 2377 RNA-seq samples from GTEx, the improvement (measured with adjusted precision) is 7.5% and 11.2% when Coral is incorporated with StringTie and Scallop, respectively.

<h3>Availability</h3>
Coral is open-source and freely available on GitHub (https://github.com/Shao-Group/coral) and Bioconda. Scripts, datasets, and documentation needed to reproduce all experimental results in this paper are available at https://github.com/Shao-Group/coraltest.
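
The two-stage idea summarized under Results (enumerate the top <i>N</i> bridging paths by dynamic programming, then pick one using fragment length) can be illustrated with a minimal sketch. This is not Coral's implementation. It assumes, purely for illustration, that the two mate ends are connected through a small directed acyclic graph whose integer node ids are already in topological order, that a path's reliability is its smallest edge weight, and that the final choice is simply the candidate whose length is closest to an expected fragment length. The names `top_n_paths`, `select_by_length`, `graph`, and `node_length` are hypothetical.

```python
import heapq

def top_n_paths(graph, source, target, n):
    """Return up to n (score, path) pairs from source to target, best first.

    graph maps node -> list of (next_node, edge_weight).
    ASSUMPTION: node ids are integers in topological order.
    ASSUMPTION: a path's score is its smallest edge weight.
    """
    by_score = lambda cand: cand[0]
    best = {source: [(float("inf"), (source,))]}
    for u in sorted(graph):
        # All predecessors of u were processed already, so best[u] is complete;
        # keep only the n highest-scoring partial paths before extending them.
        best[u] = heapq.nlargest(n, best.get(u, []), key=by_score)
        for score, path in best[u]:
            for v, weight in graph[u]:
                best.setdefault(v, []).append((min(score, weight), path + (v,)))
    return heapq.nlargest(n, best.get(target, []), key=by_score)

def select_by_length(candidates, node_length, expected_length):
    """Pick the candidate path whose total length is closest to expected_length.

    ASSUMPTION: a simple stand-in for a consensus step that uses the
    fragment-length distribution; only its mean is used here.
    """
    def gap(cand):
        _, path = cand
        return abs(sum(node_length[v] for v in path) - expected_length)
    return min(candidates, key=gap)[1]

# Toy graph: two alternative routes from node 0 to node 3.
graph = {0: [(1, 5), (2, 9)], 1: [(3, 5)], 2: [(3, 2)], 3: []}
node_length = {0: 100, 1: 50, 2: 200, 3: 100}

candidates = top_n_paths(graph, 0, 3, n=2)
chosen = select_by_length(candidates, node_length, expected_length=380)
print(candidates)
print(chosen)
```

In this toy case the path through node 1 has the higher score, but the path through node 2 is selected because its length (400) is closer to the assumed expected fragment length (380) than the alternative (250). This shows why keeping several candidates rather than a single optimum can matter.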

