Abstract

The computational demand of exact-search procedures has pressed the exploitation of parallel processing accelerators to reduce the execution time of many applications. However, this often imposes strict restrictions in terms of the problem size and implementation efforts, mainly due to their possibly distinct architectures. To circumvent this limitation, a new exact-search alignment tool (BowMapCL) based on the Burrows-Wheeler Transform and FM-Index is presented. Contrasting to other alternatives, BowMapCL is based on a unified implementation using OpenCL, allowing the exploitation of multiple and possibly different devices (e.g., NVIDIA, AMD/ATI, and Intel GPUs/APUs). Furthermore, to efficiently exploit such heterogeneous architectures, BowMapCL incorporates several techniques to promote its performance and scalability, including multiple buffering, work-queue task-distribution, and dynamic load-balancing, together with index partitioning, bit-encoding, and sampling. When compared with state-of-the-art tools, the attained results showed that BowMapCL (using a single GPU) is 2 × to 7.5 × faster than mainstream multi-threaded CPU BWT-based aligners, like Bowtie, BWA, and SOAP2; and up to 4 × faster than the best performing state-of-the-art GPU implementations (namely, SOAP3 and HPG-BWT). When multiple and completely distinct devices are considered, BowMapCL efficiently scales the offered throughput, ensuring a convenient load-balance of the involved processing in the several distinct devices.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call