Abstract
Background: Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed.

Results: We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (CLARK) method. Using the processing power of a single Titan X GPU, cuCLARK can reach classification speeds of up to 50 million reads per minute. Corresponding speedups for species- (genus-)level classification range between 3.2 and 6.6 (3.7 and 6.4) compared to multi-threaded CLARK executed on a 16-core Xeon CPU workstation.

Conclusion: cuCLARK can perform metagenomic read classification at superior speeds on CUDA-enabled GPUs. It is free software licensed under GPL and can be downloaded free of charge at https://github.com/funatiq/cuclark.
Highlights
Experimental setup: Experimental results were obtained by running CLARK (classification of metagenomic sequences using reduced k-mers), cuCLARK, and Kraken on a workstation featuring an Intel Xeon E5-2683v4 16-core processor, 128 GB of DDR4 RAM, and a CUDA (Compute Unified Device Architecture)-capable graphics processing unit (GPU), namely a Pascal-based NVIDIA Titan X with 12 GB of GDDR5X graphics memory.
We tested the scalability of cuCLARK on a system with an Intel Xeon E5-2670v2 central processing unit (CPU), 64 GB of DDR3 RAM, and four Kepler-based NVIDIA GeForce GTX Titan GPUs, each providing 6 GB of GDDR5 video RAM.
Summary
We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (CLARK) method. Using the processing power of a single Titan X GPU, cuCLARK can reach classification speeds of up to 50 million reads per minute. Corresponding speedups for species- (genus-)level classification range between 3.2 and 6.6 (3.7 and 6.4) compared to multi-threaded CLARK executed on a 16-core Xeon CPU workstation. Conclusion: cuCLARK can perform metagenomic read classification at superior speeds on CUDA-enabled GPUs. It is free software licensed under GPL and can be downloaded free of charge at https://github.com/funatiq/cuclark.
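
As a rough illustration of what k-mer-based read classification involves, the toy sketch below assigns a read to the label whose k-mers it matches most often. It is not cuCLARK's or CLARK's implementation: the function names, the in-memory dictionary index, the tie-handling rule, and the choice of keeping only k-mers that occur in exactly one label are all assumptions made for illustration, and it runs on the CPU in plain Python.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """Map every k-mer that occurs in exactly one label to that label.

    `references` is a hypothetical dict of label -> reference sequence.
    K-mers shared by two or more labels are dropped (illustrative assumption).
    """
    owners = {}
    for label, seq in references.items():
        for kmer in set(kmers(seq, k)):
            owners.setdefault(kmer, set()).add(label)
    return {
        kmer: next(iter(labels))
        for kmer, labels in owners.items()
        if len(labels) == 1
    }


def classify(read, index, k):
    """Return the label with the most k-mer hits, or None if there is no hit or a tie."""
    hits = Counter(index[km] for km in kmers(read, k) if km in index)
    if not hits:
        return None
    ranked = hits.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous: two labels share the top hit count
    return ranked[0][0]
```

In this sketch each read is classified independently of all others, which is the property that makes the task a natural fit for massively parallel hardware such as GPUs.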