Abstract

The k-mer processing techniques based on partitioning the data set on disk using minimizer-type seeds have led to a significant reduction in memory requirements; however, they add processes (the search for and distribution of super k-mers) that can be intensive given the large volume of data. This paper presents a massively parallel processing model that enables the efficient use of heterogeneous computation to accelerate the seed-based (minimizer or signature) search for super k-mers. The model includes three main contributions: a new data structure, called CISK, that represents super k-mers and their minimizers in an indexed and compact way, and two massive parallelization patterns, one for obtaining the canonical m-mers of a set of reads and another for searching for super k-mers based on minimizers. The model was implemented as two OpenCL kernels. The evaluation of the kernels shows favorable execution times and memory requirements, making the model suitable for constructing heterogeneous solutions with simultaneous execution (workload distribution) that co-process reads using current super k-mer search methods on the CPU and the methods presented herein on the GPU. The model implementation code is available in the repository: https://github.com/BioinfUD/K-mersCL.
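
To make the first parallelization pattern concrete, the following is a minimal scalar sketch in C of the per-position work it distributes: computing the canonical m-mer (the lexicographically smaller of an m-mer and its reverse complement) at a given offset of a read. The 2-bit base encoding and the comparison order are illustrative assumptions, not the paper's kernel code, which is written in OpenCL.

    #include <stdint.h>

    /* Map a base to a 2-bit code: A=0, C=1, G=2, T=3 (assumed encoding). */
    static inline uint64_t base_code(char b) {
        switch (b) {
            case 'A': return 0; case 'C': return 1;
            case 'G': return 2; default:  return 3; /* 'T' */
        }
    }

    /* Pack the m-mer at read[pos .. pos+m) and its reverse complement into
     * 2-bit-per-base integers (assumes m <= 32) and return the smaller of the
     * two, i.e., the canonical m-mer. */
    uint64_t canonical_mmer(const char *read, int pos, int m) {
        uint64_t fwd = 0, rev = 0;
        for (int i = 0; i < m; ++i) {
            uint64_t c = base_code(read[pos + i]);
            fwd = (fwd << 2) | c;             /* forward strand                */
            rev |= (3ULL - c) << (2 * i);     /* complement in reverse order   */
        }
        return fwd < rev ? fwd : rev;         /* canonical = lexicographic min */
    }

Since each read position can be evaluated independently, one natural mapping on a GPU is one work-item per m-mer position, which is what makes this pattern amenable to massive parallelization.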

Highlights

  • The search for the super k-mers of a genomic read is a task that requires finding the seed of each possible k-mer and comparing the seeds with each other in order to identify contiguous k-mers that share the same minimizer [1]

  • This paper proposes a massively parallel processing model for the search of super k-mers whose memory requirements and execution times make it suitable for developing efficient heterogeneous solutions with simultaneous CPU-GPU execution

  • A processing model was obtained that efficiently parallelizes the search for super k-mers on many-core architectures, using two new parallelization algorithms that maximize operational intensity and a data structure that substantially reduces the memory required to represent the output data (see the sketch after this list)
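
The exact layout of the CISK structure is not detailed here, so the following is only a hypothetical sketch, in C, of a compact and indexed output layout in that spirit: each super k-mer is described by its position and length within the original read instead of being copied, so the output grows with the number of super k-mers rather than with their total length. All names and fields are illustrative assumptions.

    #include <stdint.h>

    /* Hypothetical compact, indexed representation of super k-mers (not the
     * paper's actual CISK layout): entries reference positions in the reads
     * instead of storing the sequences themselves. */
    typedef struct {
        uint32_t read_id;    /* read that contains the super k-mer              */
        uint32_t start;      /* start offset of the super k-mer within the read */
        uint16_t length;     /* length in bases (always >= k)                   */
        uint64_t minimizer;  /* packed canonical minimizer shared by its k-mers */
    } cisk_entry_t;

    typedef struct {
        cisk_entry_t *entries;     /* one entry per super k-mer             */
        uint32_t     *read_index;  /* index of the first entry of each read */
        uint32_t      n_entries;
        uint32_t      n_reads;
    } cisk_t;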

Introduction

The search for the super k-mers of a genomic read is a task that requires finding the seed (canonical minimizer or signature) of each possible k-mer and comparing the seeds with each other in order to identify contiguous k-mers that share the same minimizer [1]. Because reads can be processed independently of one another, the search for super k-mers is highly suitable for acceleration by simultaneous heterogeneous processing: the workload is partitioned and processed simultaneously on the CPU and the GPU(s), through either a static, dynamic [2], or hybrid [3] distribution. For this type of processing to be carried out efficiently, the following challenge must be overcome: when massively parallelized, the search for super k-mers has a very high and unpredictable memory requirement, because the space required depends on the data that are generated rather than on the input data.
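
As a reference for the semantics described above, the following sequential C sketch scans a read, computes the canonical minimizer of each k-mer naively (the minimum canonical m-mer inside the window, repeating the helpers from the sketch after the abstract so the example is self-contained), and closes a super k-mer whenever the minimizer changes. It only illustrates the task; it is not the paper's parallel method, and the encoding and the toy parameters in main are assumptions.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    static uint64_t base_code(char b) {
        switch (b) { case 'A': return 0; case 'C': return 1;
                     case 'G': return 2; default:  return 3; }
    }

    /* Canonical (strand-independent) packed value of the m-mer at read[pos..pos+m). */
    static uint64_t canonical_mmer(const char *read, int pos, int m) {
        uint64_t fwd = 0, rev = 0;
        for (int i = 0; i < m; ++i) {
            uint64_t c = base_code(read[pos + i]);
            fwd = (fwd << 2) | c;
            rev |= (3ULL - c) << (2 * i);
        }
        return fwd < rev ? fwd : rev;
    }

    /* Canonical minimizer of the k-mer starting at pos: the minimum canonical m-mer. */
    static uint64_t kmer_minimizer(const char *read, int pos, int k, int m) {
        uint64_t best = UINT64_MAX;
        for (int i = 0; i + m <= k; ++i) {
            uint64_t v = canonical_mmer(read, pos + i, m);
            if (v < best) best = v;
        }
        return best;
    }

    /* Print the super k-mers of one read: maximal runs of contiguous k-mers
     * that share the same canonical minimizer. */
    static void super_kmers(const char *read, int k, int m) {
        int n = (int)strlen(read);
        if (n < k) return;
        int start = 0;                        /* first base of the current super k-mer */
        uint64_t cur = kmer_minimizer(read, 0, k, m);
        for (int p = 1; p + k <= n; ++p) {
            uint64_t mz = kmer_minimizer(read, p, k, m);
            if (mz != cur) {                  /* minimizer changed: close the run */
                printf("super k-mer [%d, %d) minimizer=%llu\n",
                       start, p - 1 + k, (unsigned long long)cur);
                start = p;
                cur = mz;
            }
        }
        printf("super k-mer [%d, %d) minimizer=%llu\n",
               start, n, (unsigned long long)cur);
    }

    int main(void) {
        super_kmers("ACGTACGTTGCAACGT", 7, 3);   /* toy read, k = 7, m = 3 */
        return 0;
    }

A parallel version cannot know in advance how many super k-mers each read will produce, which is exactly the unpredictable-output problem noted above.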
