Abstract

Background: Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e., the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed.

Results: We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (CLARK) method. Using the processing power of a single Titan X GPU, cuCLARK can reach classification speeds of up to 50 million reads per minute. Corresponding speedups for species- (genus-)level classification range between 3.2 and 6.6 (3.7 and 6.4) compared to multi-threaded CLARK executed on a 16-core Xeon CPU workstation.

Conclusion: cuCLARK can perform metagenomic read classification at superior speeds on CUDA-enabled GPUs. It is free software licensed under GPL and can be downloaded at https://github.com/funatiq/cuclark free of charge.
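To make the read classification problem concrete, the following is a minimal CPU-only sketch of k-mer-based classification. It is a toy illustration, not cuCLARK's or CLARK's implementation: the function names (`kmers`, `build_index`, `classify`), the tiny reference sequences, and the value `k=4` are all hypothetical, and the sketch assumes that a read is assigned to the label whose target-specific k-mers it hits most often. Reverse complements, reduced k-mer sets, and GPU execution are deliberately left out.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer found in exactly one reference to that reference's label."""
    owners = defaultdict(set)
    for label, genome in references.items():
        for km in kmers(genome, k):
            owners[km].add(label)
    # Keep only k-mers that are specific to a single label (assumption of this sketch).
    return {km: next(iter(labels)) for km, labels in owners.items() if len(labels) == 1}


def classify(read, index, k):
    """Return the label with the most k-mer hits, or None if no k-mer matches."""
    hits = Counter(index[km] for km in kmers(read, k) if km in index)
    if not hits:
        return None
    return hits.most_common(1)[0][0]


# Hypothetical toy references; real reference genomes are millions of bases long.
references = {
    "species_A": "ACGTACGTGGA",
    "species_B": "TTGACCATGCA",
}
index = build_index(references, k=4)

print(classify("ACGTGG", index, k=4))  # read drawn from species_A
print(classify("GACCAT", index, k=4))  # read drawn from species_B
print(classify("CCCCCC", index, k=4))  # no matching k-mer, stays unclassified
```

Each read is classified independently of all others, which is one reason this kind of workload is a natural candidate for massively parallel hardware such as GPUs.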

Highlights

  • Metagenomic sequencing studies are becoming increasingly popular with prominent examples including the sequencing of human microbiomes and diverse environments

  • Experimental setup: Experimental results have been obtained by running CLARK (classification of metagenomic sequences using reduced k-mers), cuCLARK and Kraken on a workstation featuring an Intel Xeon E5-2683v4 16-core processor, 128 GB of DDR4 RAM and a CUDA-capable GPU, namely a Pascal-based NVIDIA Titan X with 12 GB of GDDR5X graphics memory

  • We have tested the scalability of cuCLARK on a system with an Intel Xeon E5-2670v2 CPU, 64 GB of DDR3 RAM and four Kepler-based NVIDIA GeForce GTX Titan GPUs, each providing 6 GB of GDDR5 video RAM

Summary

Results

We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (CLARK) method. Using the processing power of a single Titan X GPU, cuCLARK can reach classification speeds of up to 50 million reads per minute. Corresponding speedups for species- (genus-)level classification range between 3.2 and 6.6 (3.7 and 6.4) compared to multi-threaded CLARK executed on a 16-core Xeon CPU workstation.

Conclusion: cuCLARK can perform metagenomic read classification at superior speeds on CUDA-enabled GPUs. It is free software licensed under GPL and can be downloaded at https://github.com/funatiq/cuclark free of charge.
