Abstract

BackgroundThe Smith-Waterman algorithm is known to be a more sensitive approach than heuristic algorithms for local sequence alignment algorithms. Despite its sensitivity, a greater time complexity associated with the Smith-Waterman algorithm prevents its application to the all-pairs comparisons of base sequences, which aids in the construction of accurate phylogenetic trees. The aim of this study is to achieve greater acceleration using the Smith-Waterman algorithm (by realizing interpair block pruning and band optimization) compared with that achieved using a previous method that performs intrapair block pruning on graphics processing units (GPUs).ResultsWe present an interpair optimization method for the Smith-Waterman algorithm with the aim of accelerating the all-pairs comparison of base sequences. Given the results of the pairs of sequences, our method realizes efficient block pruning by computing a lower bound for other pairs that have not yet been processed. This lower bound is further used for band optimization. We integrated our interpair optimization method into SW#, a previous GPU-based implementation that employs variants of a banded Smith-Waterman algorithm and a banded Myers-Miller algorithm. Evaluation using the six genomes of Bacillus anthracis shows that our method pruned 88 % of the matrix cells on a single GPU and 73 % of the matrix cells on two GPUs. For the genomes of the human chromosome 21, the alignment performance reached 202 giga-cell updates per second (GCUPS) on two Tesla K40 GPUs.ConclusionsEfficient interpair pruning and band optimization makes it possible to complete the all-pairs comparisons of the sequences of the same species 1.2 times faster than the intrapair pruning method. This acceleration was achieved at the first phase of SW#, where our method significantly improved the initial lower bound. However, our interpair optimization was not effective for the comparison of the sequences of different species such as comparing human, chimpanzee, and gorilla. Consequently, our method is useful in accelerating the applications that require optimal local alignments scores for the same species. The source code is available for download from http://www-hagi.ist.osaka-u.ac.jp/research/code/.

Highlights

  • The Smith-Waterman algorithm is known to be a more sensitive approach than heuristic algorithms for local sequence alignment algorithms

  • To evaluate our interpair optimization method in terms of execution time, we compared our method with the original SW#, which uses an existing intrapair pruning method [16]

  • An interpair optimization method has been presented for accelerating the all-pairs SW comparisons of sequences

Read more

Summary

Introduction

The Smith-Waterman algorithm is known to be a more sensitive approach than heuristic algorithms for local sequence alignment algorithms. Pairwise sequence alignment identifies similar regions between two biological sequences (such as between nucleotide and protein sequences) and is useful for analyzing functional, structural, and evolutional relationships between the two. Such alignment algorithms can be classified into two groups: global and local alignment. Because the length of biological sequences can reach giga-base pairs (Gbp), many researchers have accelerated the SW algorithm using various hardware such as graphics processing units (GPUs) [2,3,4,5], single-instruction multiple-data (SIMD) enabled CPUs [6,7,8], field programmable gate arrays [9] and Xeon Phi [10]. GPUs [11] emerge as accelerators for graphics applications and for general applications [12,13,14]

Objectives
Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.