Abstract

Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from single-end cDNA sequences. In contrast to other methods, our approach accommodates multi-junction structures. Our method compares favorably with competing tools for conventionally spliced mRNAs and, with a gain in recall of up to 40%, systematically outperforms them on reads with multiple splits, trans-splicing and circular products. The algorithm is integrated into our mapping tool segemehl (http://www.bioinf.uni-leipzig.de/Software/segemehl/).

Highlights

  • The term splicing refers to a post-transcriptional process in which intronic sequences are removed from the raw transcript (pre-mRNA)

  • While the overwhelming majority of splicing events occurs within the same pre-mRNA at consensus splice sites, some mRNAs are spliced at non-consensus sites

  • For a read of length m, the algorithm evaluates the best alignments with a limited number of mismatches, insertions and deletions for all 2(m − ℓ) suffixes of the read and its reverse complement, where ℓ is the minimum suffix length
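The suffix enumeration described in the last highlight can be illustrated with a minimal sketch. It only lists the candidate suffixes of a read and of its reverse complement; the alignment step itself (mismatches, insertions and deletions) is not shown. The names `reverse_complement`, `candidate_suffixes` and `min_len` are hypothetical and do not come from segemehl. To reproduce the stated count of 2(m − min_len), the sketch assumes that suffixes start at positions 0 to m − min_len − 1; whether the boundary suffix of length exactly `min_len` is included in the actual tool is an assumption.

```python
# Illustrative sketch only: enumerates the suffixes that a suffix-based
# split-read aligner would consider. Not the segemehl implementation.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_suffixes(read: str, min_len: int):
    """Yield (strand, start, suffix) for the read and its reverse complement.

    Assumption: start positions run from 0 to m - min_len - 1, which gives
    m - min_len suffixes per strand and 2 * (m - min_len) in total.
    """
    m = len(read)
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for start in range(max(m - min_len, 0)):
            yield strand, start, seq[start:]


if __name__ == "__main__":
    read = "ACGTTGCAAG"
    for strand, start, suffix in candidate_suffixes(read, min_len=6):
        print(strand, start, suffix)
```

In a real mapper each suffix would be matched against an index of the reference genome rather than printed, so that the best-scoring alignments per suffix can later be chained into split-read candidates.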


Summary

Introduction

The term splicing refers to a post-transcriptional process in which intronic sequences are removed from the raw transcript (pre-mRNA). Many transcripts derived from splicing at non-consensus sites may have escaped detection in the past, either because of assumptions built into in silico analysis pipelines or because of the limited throughput of earlier RNA sequencing (RNA-seq) protocols. The original version of TopHat [9] predicts exon locations from coverage data and attempts split-read alignments across neighboring exons. Because this algorithm could not detect fusion events, a new algorithm, TopHat-Fusion [10], was published and has since been integrated into TopHat along with other modifications to the original algorithm. The tags are aligned to exons and junctions inferred from tags mapping to consecutive exons.
