DART: a fast and accurate RNA-seq mapper with a partitioning strategy.

Hsin-Nan Lin, Wen-Lian Hsu, Bonnie Berger

doi:10.1093/bioinformatics/btx558

Open Access

Abstract

Motivation: In recent years, massively parallel cDNA sequencing (RNA-seq) technologies have become powerful tools for high-resolution measurement of expression and for sensitive detection of low-abundance transcripts. However, analyzing RNA-seq data requires substantial computational effort. The fundamental and critical step is to align each sequence fragment against the reference genome. Various de novo spliced RNA aligners have been developed in recent years. Although these aligners can handle spliced alignment and detect splice junctions, some challenges remain. With advances in sequencing technologies and the ongoing collection of sequencing data in the ENCODE project, more efficient alignment algorithms are in high demand. Most read mappers follow the conventional seed-and-extend strategy to deal with inexact matches in sequence alignment. However, the extension step is much more time-consuming than the seeding step.

Results: We propose a novel de novo RNA-seq mapping algorithm, called DART, which adopts a partitioning strategy to avoid the extension step. Experimental results on synthetic datasets and real NGS datasets show that DART is a highly efficient aligner whose sensitivity and accuracy are the highest among, or comparable to, those of most state-of-the-art aligners. More importantly, it takes the least time among the selected aligners.

Availability and implementation: https://github.com/hsinnan75/DART

Supplementary information: Supplementary data are available at Bioinformatics online.