Abstract
Background
Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for de novo assembly in terms of assembly quality and scalability for large-scale short-read datasets.
Results
We present DecGPU, the first parallel and distributed error correction algorithm for high-throughput short reads (HTSRs), using a hybrid combination of the CUDA and MPI parallel programming models. DecGPU provides CPU-based and GPU-based versions. The CPU-based version employs coarse-grained and fine-grained parallelism using the MPI and OpenMP parallel programming models, while the GPU-based version takes advantage of the CUDA and MPI parallel programming models and employs a hybrid CPU+GPU computing model to maximize performance by overlapping CPU and GPU computation. The distributed nature of our algorithm makes it feasible and flexible for the error correction of large-scale HTSR datasets. Using simulated and real datasets, our algorithm demonstrates superior performance, in terms of error correction quality and execution speed, compared to existing error correction algorithms. Furthermore, when combined with Velvet and ABySS, the resulting DecGPU-Velvet and DecGPU-ABySS assemblers demonstrate the potential of our algorithm to improve de novo assembly quality for de-Bruijn-graph-based assemblers.
Conclusions
DecGPU is publicly available, open-source software written in CUDA C++ and MPI. The experimental results suggest that DecGPU is an effective and feasible error correction algorithm for tackling the flood of short reads produced by next-generation sequencing technologies.
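To make the hybrid CUDA+MPI design described above more concrete, the sketch below shows one way an MPI rank can bind to a GPU and overlap host-side batch preparation with a device-side correction kernel via a CUDA stream. It is a minimal illustration under our own assumptions, not code from DecGPU: the kernel correct_batch, the helper prepare_next_batch, and the batch sizes are hypothetical placeholders for the real spectrum-based correction stages.

```cuda
// Minimal sketch (not DecGPU's actual code): one MPI rank per GPU, with the
// CPU preparing the next batch of reads while the GPU processes the current
// one, so host and device work overlap on a CUDA stream. All names here
// (correct_batch, prepare_next_batch, MAX_READ_LEN, ...) are illustrative.
#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

#define MAX_READ_LEN 64
#define BATCH_READS  (1 << 16)

// Placeholder device kernel: in a real corrector this stage would check each
// read's k-mers against the k-mer spectrum; here it only normalizes bases.
__global__ void correct_batch(char* reads, int numReads) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < numReads) {
        char* read = reads + (size_t)tid * MAX_READ_LEN;
        for (int i = 0; i < MAX_READ_LEN; ++i)
            if (read[i] >= 'a' && read[i] <= 'z') read[i] -= 32;
    }
}

// CPU-side work for the *next* batch (e.g. parsing/encoding reads); this runs
// while the GPU is still busy with the current batch.
static void prepare_next_batch(std::vector<char>& hostBuf) {
    for (size_t i = 0; i < hostBuf.size(); ++i) hostBuf[i] = 'a';
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs, numGPUs = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    cudaGetDeviceCount(&numGPUs);
    if (numGPUs > 0) cudaSetDevice(rank % numGPUs);  // one GPU per MPI rank

    const size_t batchBytes = (size_t)BATCH_READS * MAX_READ_LEN;
    std::vector<char> hostBuf[2];
    hostBuf[0].assign(batchBytes, 'a');
    hostBuf[1].assign(batchBytes, 'a');
    char* devBatch = nullptr;
    cudaMalloc(&devBatch, batchBytes);
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    const int numBatches = 4;                // each rank handles its own batches
    for (int b = 0; b < numBatches; ++b) {
        std::vector<char>& cur = hostBuf[b % 2];
        std::vector<char>& nxt = hostBuf[(b + 1) % 2];
        cudaMemcpyAsync(devBatch, cur.data(), batchBytes,
                        cudaMemcpyHostToDevice, stream);
        correct_batch<<<(BATCH_READS + 255) / 256, 256, 0, stream>>>(
            devBatch, BATCH_READS);
        prepare_next_batch(nxt);             // CPU work overlaps the GPU kernel
        cudaStreamSynchronize(stream);       // wait before reusing the buffers
    }

    cudaStreamDestroy(stream);
    cudaFree(devBatch);
    MPI_Barrier(MPI_COMM_WORLD);             // ranks finish their partitions together
    if (rank == 0) printf("processed %d batches on %d ranks\n", numBatches, nprocs);
    MPI_Finalize();
    return 0;
}
```

Double-buffering the host batches is what lets prepare_next_batch run while the kernel is in flight; a full pipeline would additionally copy the corrected reads back to the host and write them out.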
Highlights
Next-generation sequencing technologies have led to the high-throughput production of sequence data at low cost
The GPU-based version takes advantage of the compute unified device architecture (CUDA) and message passing interface (MPI) parallel programming models and employs a hybrid CPU+GPU computing model to maximize performance by overlapping CPU and GPU computation
We have evaluated the performance of DecGPU from three perspectives: (1) the error correction quality on both simulated and real short read datasets; (2) de novo assembly quality improvement after combining our algorithm with Velvet and ABySS; and (3) the scalability with respect to different numbers of compute resources for the CPU-based and GPU-based versions, respectively
Summary
We have evaluated the performance of DecGPU from three perspectives: (1) the error correction quality on both simulated and real short read datasets; (2) de novo assembly quality improvement after combining our algorithm with Velvet (version 1.0.17) and ABySS (version 1.2.1); and (3) the scalability with respect to different numbers of compute resources for the CPU-based and GPU-based versions, respectively. The execution speed of DecGPU is evaluated using the three real datasets in terms of: (1) the scalability of the CPU-based and GPU-based versions with respect to different numbers of compute resources, and (2) the execution time of the GPU-based version compared to that of CUDA-EC (version 1.0.1) on a single GPU. Both assessments are conducted on the previously described computing cluster. Even though our algorithm does not show good parallel scalability with respect to different numbers of compute resources, its distributed design does provide a feasible and flexible solution to the error correction of large-scale HTSR datasets
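The coarse-grained/fine-grained split mentioned in the abstract for the CPU-based version can be pictured as follows: MPI ranks each take a disjoint slice of the read set, and OpenMP threads correct the reads within that slice. The sketch below is a minimal illustration under our own assumptions, not DecGPU's actual code: correct_read stands in for the real spectrum-based correction, and the in-memory read vector stands in for streaming a FASTQ/FASTA partition.

```cuda
// Minimal sketch (not DecGPU's actual code) of coarse-grained MPI plus
// fine-grained OpenMP parallelism for a CPU-based error corrector.
// All function and variable names are illustrative placeholders.
#include <mpi.h>
#include <omp.h>
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Placeholder for per-read correction against the k-mer spectrum.
static void correct_read(std::string& read) {
    for (char& c : read)
        if (c == 'n') c = 'N';               // stand-in for real base fixing
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // In a real run each rank would stream its slice of the FASTQ/FASTA
    // input; here a small in-memory dataset is fabricated for illustration.
    std::vector<std::string> allReads(100000, "ACGTnACGTnACGT");

    // Coarse-grained parallelism: a contiguous block of reads per MPI rank.
    size_t per   = (allReads.size() + nprocs - 1) / nprocs;
    size_t begin = std::min(allReads.size(), (size_t)rank * per);
    size_t end   = std::min(allReads.size(), begin + per);

    // Fine-grained parallelism: OpenMP threads share the rank's slice.
    #pragma omp parallel for schedule(dynamic, 1024)
    for (long i = (long)begin; i < (long)end; ++i)
        correct_read(allReads[i]);

    long localDone = (long)(end - begin), totalDone = 0;
    MPI_Reduce(&localDone, &totalDone, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("corrected %ld reads across %d ranks\n", totalDone, nprocs);

    MPI_Finalize();
    return 0;
}
```

Because each rank owns a disjoint slice of the reads, adding ranks only changes how the dataset is partitioned, which is what makes the distributed approach flexible for large-scale HTSR datasets even when parallel efficiency is imperfect.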