CUDAMPF: a multi-tiered parallel framework for accelerating protein sequence search in HMMER on CUDA-enabled GPU.

Hanyu Jiang, Narayan Ganesan

doi:10.1186/s12859-016-0946-4

Open Access

Journal: BMC Bioinformatics
Publication Date: Feb 27, 2016
Citations: 32
License type: CC BY 4.0

Affiliation: Stevens Institute of Technology

Abstract

Background: The HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity. The latest version of hmmsearch in HMMER 3.x utilizes a heuristic pipeline consisting of the MSV/SSV (Multiple/Single ungapped Segment Viterbi) stage, the P7Viterbi stage and the Forward scoring stage to accelerate homology detection. Since the latest version is highly optimized for performance on modern multi-core CPUs with SSE capabilities, only a few acceleration attempts report speedup. However, the most compute-intensive tasks within the pipeline (viz., the MSV/SSV and P7Viterbi stages) still stand to benefit from the computational capabilities of massively parallel processors.

Results: A Multi-Tiered Parallel Framework (CUDAMPF) implemented on CUDA-enabled GPUs, presented here, offers finer-grained parallelism for the MSV/SSV and Viterbi algorithms. We couple the SIMT (Single Instruction Multiple Threads) mechanism with SIMD (Single Instruction Multiple Data) video instructions under warp-synchronism to achieve high-throughput processing and eliminate thread idling. We also propose a hardware-aware optimal allocation scheme for scarce resources such as on-chip memory and caches in order to boost the performance and scalability of CUDAMPF. In addition, runtime compilation via NVRTC, available with CUDA 7.0, is incorporated into the framework; it not only helps unroll the innermost loop to yield up to a 2- to 3-fold speedup over static compilation but also enables dynamic loading and switching of kernels depending on the query model size, in order to achieve optimal performance.

Conclusions: CUDAMPF is designed as a hardware-aware parallel framework for accelerating computational hotspots within the hmmsearch pipeline as well as other sequence alignment applications. It achieves significant speedup by exploiting hierarchical parallelism on a single GPU and takes full advantage of limited resources based on their own performance features. In addition to exceeding the performance of other acceleration attempts, comprehensive evaluations against high-end CPUs (Intel i5, i7 and Xeon) show that CUDAMPF yields up to 440 GCUPS for SSV, 277 GCUPS for MSV and 14.3 GCUPS for P7Viterbi, all with 100 % accuracy, which translates to a maximum speedup of 37.5, 23.1 and 11.6-fold for MSV, SSV and P7Viterbi respectively. The source code is available at https://github.com/Super-Hippo/CUDAMPF.
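The SIMD video instructions mentioned in the Results operate on several small integers packed into one 32-bit register, so each GPU thread updates multiple score cells per instruction. The abstract does not say which instructions CUDAMPF uses. The sketch below is only a plain-Python emulation of three packed byte-wise operations, named after the CUDA intrinsics `__vmaxu4`, `__vaddus4` and `__vsubus4`; choosing these three is an assumption made for illustration, and the helper names `pack4` and `unpack4` are hypothetical.

```python
def unpack4(word):
    """Split a 32-bit word into four unsigned 8-bit lanes (lane 0 = lowest byte)."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def pack4(lanes):
    """Pack four values (each truncated to 8 bits) into one 32-bit word."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (8 * i)
    return word


def vmaxu4(a, b):
    """Per-byte unsigned maximum, emulating the semantics of __vmaxu4."""
    return pack4([max(x, y) for x, y in zip(unpack4(a), unpack4(b))])


def vaddus4(a, b):
    """Per-byte unsigned addition saturating at 255, emulating __vaddus4."""
    return pack4([min(x + y, 255) for x, y in zip(unpack4(a), unpack4(b))])


def vsubus4(a, b):
    """Per-byte unsigned subtraction saturating at 0, emulating __vsubus4."""
    return pack4([max(x - y, 0) for x, y in zip(unpack4(a), unpack4(b))])


if __name__ == "__main__":
    scores = pack4([250, 1, 2, 3])
    gains = pack4([10, 1, 1, 1])
    # Lane 0 saturates at 255 instead of wrapping around.
    print(unpack4(vaddus4(scores, gains)))
```

Saturating arithmetic matters here because 8-bit score lanes would otherwise wrap around on overflow and corrupt the running maximum.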

Highlights

The HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity
While optimal alignment scores are useful in studying similarity between individual sequences, Forward scores are more meaningful when aligning target protein sequences against a probabilistic model such as a Hidden Markov Model (HMM)
Benchmark environment: To evaluate the proposed MSV and Viterbi algorithms in CUDAMPF comprehensively, the benchmark analysis is composed of two parts: (1) an intrinsic comparison of different configurations to study the relationship between GPU (Graphics Processing Unit) kernel performance (GCUPS: Giga Cell Updates Per Second), cache hit ratio, kernel occupancy and the length of query models; (2) an extrinsic comparison of performance between CUDAMPF on GPU and hmmsearch from HMMER 3.1b2 on CPU
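As a rough illustration of the GCUPS metric used in the benchmark, the sketch below assumes the conventional definition: the number of dynamic-programming cells is the query model length multiplied by the total number of residues in the target database, divided by the kernel time in seconds and by 10^9. The text here does not spell out the exact cell count the authors use, so that definition, the function name and its parameters are assumptions.

```python
def gcups(model_length, total_residues, seconds):
    """Giga Cell Updates Per Second under the assumed cell-count definition.

    model_length   -- number of match states in the query model
    total_residues -- summed length of all target sequences
    seconds        -- measured execution time
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    cells = model_length * total_residues  # one cell per (model position, residue) pair
    return cells / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical numbers: a 400-state model against 1e9 residues in 2 seconds.
    print(gcups(400, 1_000_000_000, 2.0))
```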

Summary

Introduction

The HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity. The latest version of hmmsearch in HMMER 3.x utilizes a heuristic pipeline consisting of the MSV/SSV (Multiple/Single ungapped Segment Viterbi) stage, the P7Viterbi stage and the Forward scoring stage to accelerate homology detection. Among the suite of tools in HMMER, hmmsearch is used to detect a query motif among a target database of sequences. The wide applicability of motif finding and the rapid growth of both the set of protein families and the set of known sequences have made it the target of many acceleration attempts. While optimal alignment scores are useful in studying similarity between individual sequences (as in the BLAST [12] or Smith-Waterman [13] algorithms for local alignment), Forward scores are more meaningful when aligning target protein sequences against a probabilistic model such as the HMM.
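To make the "single ungapped segment" idea behind the SSV stage concrete, the following is a minimal sketch that finds the best-scoring ungapped diagonal segment between a position-specific score profile and a target sequence. It is a simplification under stated assumptions, not HMMER's or CUDAMPF's implementation: it ignores transition costs, score biasing and 8-bit saturation, and the `toy_profile` helper with its +2/-1 scores is hypothetical.

```python
def toy_profile(consensus, match=2, mismatch=-1):
    """Hypothetical profile: each model position rewards only its consensus residue."""
    return [lambda r, c=c: match if r == c else mismatch for c in consensus]


def ssv_score(profile, seq):
    """Best score of a single ungapped segment aligning seq to the profile.

    Each cell extends the segment ending on the same diagonal (previous
    residue, previous model position) or starts a new one from zero.
    """
    m = len(profile)
    best = 0
    prev = [0] * (m + 1)  # prev[k]: best segment ending at model position k, previous residue
    for residue in seq:
        cur = [0] * (m + 1)
        for k in range(1, m + 1):
            cur[k] = max(prev[k - 1], 0) + profile[k - 1](residue)
            best = max(best, cur[k])
        prev = cur
    return best


if __name__ == "__main__":
    profile = toy_profile("ACDE")
    print(ssv_score(profile, "WWACDEWW"))  # full ungapped match of the consensus
    print(ssv_score(profile, "ACWDE"))     # inserted W breaks the diagonal
```

Because no gaps are allowed, the inserted residue in the second call prevents the two halves of the match from being joined into one segment; every cell depends only on its diagonal predecessor, which is what makes the stage amenable to data-parallel evaluation.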
