Abstract

Background

Due to the computational complexity of sequence alignment algorithms, various accelerated solutions have been proposed to speed up this analysis. NVBIO is the only available GPU library that accelerates sequence alignment of high-throughput NGS data, but has limited performance. In this article we present GASAL2, a GPU library for aligning DNA and RNA sequences that outperforms existing CPU and GPU libraries.

Results

The GASAL2 library provides specialized, accelerated kernels for local, global and all types of semi-global alignment. Pairwise sequence alignment can be performed with and without traceback. GASAL2 outperforms the fastest CPU-optimized SIMD implementations such as SeqAn and Parasail, as well as NVIDIA's own GPU-based library, NVBIO. GASAL2 is unique in performing sequence packing on the GPU, which is up to 750x faster than NVBIO. Overall, on a GeForce GTX 1080 Ti GPU, GASAL2 is up to 21x faster than Parasail on a dual-socket hyper-threaded Intel Xeon system with 28 cores and up to 13x faster than NVBIO, with a query length of up to 300 bases and 100 bases, respectively. GASAL2 alignment functions are asynchronous (non-blocking) and allow full overlap of CPU and GPU execution. The paper shows how to use GASAL2 to accelerate BWA-MEM, speeding up the local alignment by 20x, which gives an overall application speedup of 1.3x vs. CPU with up to 12 threads.

Conclusions

The library provides high-performance APIs for local, global and semi-global alignment that can be easily integrated into various bioinformatics tools.

Highlights

  • Due to the computational complexity of sequence alignment algorithms, various accelerated solutions have been proposed to speed up this analysis

  • Input dataset and execution platforms: To evaluate the performance of GASAL2, we performed one-to-one pairwise alignments between two sets of sequences

  • We considered the case of DNA read mapping


Summary

Introduction

Due to the computational complexity of sequence alignment algorithms, various accelerated solutions have been proposed to speed up this analysis. Sequence alignment is the process of editing two or more sequences using gaps and substitutions such that they closely match each other. There are two types of sequence alignment algorithms for biological sequences: global alignment and local alignment. The former is performed using the Needleman-Wunsch (NW) algorithm [8], while the Smith-Waterman (SW) algorithm [9] is used for the latter. Both algorithms have been improved by Gotoh [10] to use affine-gap penalties. These alignment algorithms can be divided into the following classes:
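As a concrete illustration of the Smith-Waterman recurrence with Gotoh's affine-gap penalties mentioned in the introduction, here is a minimal Python sketch that computes only the best local alignment score (no traceback). It is a plain CPU reference for the recurrence, not GASAL2's GPU kernel or its API. The function name `sw_affine_score` and the scoring parameters are illustrative assumptions, and the sketch assumes a gap of length k costs `gap_open + (k - 1) * gap_extend`.

```python
NEG_INF = float("-inf")

def sw_affine_score(query, target, match=1, mismatch=-4, gap_open=6, gap_extend=1):
    """Best local alignment score with affine gaps (hypothetical helper).

    Assumption: the first gap position costs gap_open and each further
    position costs gap_extend.
    """
    n = len(target)
    h_prev = [0] * (n + 1)        # H values of the previous row
    f_prev = [NEG_INF] * (n + 1)  # F: alignments ending in a vertical gap
    best = 0
    for i in range(1, len(query) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [NEG_INF] * (n + 1)
        e = NEG_INF               # E: alignments ending in a horizontal gap
        for j in range(1, n + 1):
            # Open a new gap from H, or extend the gap already open.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            sub = match if query[i - 1] == target[j - 1] else mismatch
            # Clamping at 0 is what makes the alignment local.
            h = max(0, h_prev[j - 1] + sub, e, f_cur[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev, f_prev = h_cur, f_cur
    return best
```

Only two rows of the dynamic programming matrix are kept in memory, which is sufficient when just the score is needed. Dropping the clamp at 0 and initializing the borders with gap penalties would turn the same recurrence into a global (Needleman-Wunsch style) alignment.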

