A simple refined DNA minimizer operator enables 2-fold faster computation.

Chenxu Pan, Knut Reinert

doi:10.1093/bioinformatics/btae045

Abstract

The minimizer concept is a data structure for sequence sketching. The standard canonical minimizer selects a subset of k-mers from a given DNA sequence by simultaneously comparing the forward and reverse k-mers in a window according to a predefined selection scheme. It is widely used in sequence analysis tasks such as read mapping and assembly. k-mer density, k-mer repetitiveness (e.g. k-mer bias), and computational efficiency are three critical measures for minimizer selection schemes. However, trade-offs exist among minimizer variants, and high-performance minimizer algorithms need to be generic, effective, and efficient. We propose a simple minimizer operator as a refinement of the standard canonical minimizer. It takes only a few operations to compute, yet it can improve k-mer repetitiveness, especially for the lexicographic order. It also applies to other selection schemes based on total orders (e.g. random orders). Moreover, it is computationally efficient, and its density is close to that of the standard minimizer. The refined minimizer may benefit high-performance applications such as binning and read mapping. The source code of the benchmark in this work is available in the GitHub repository https://github.com/xp3i4/mini_benchmark.
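
For readers unfamiliar with the baseline, the following is a minimal Python sketch of the standard canonical minimizer under the lexicographic order, together with a density estimate. It does not implement the refined operator proposed in the paper, whose definition is not given in the abstract. The function names, the convention that a window holds `w` consecutive k-mers, the leftmost tie-breaking rule, and the density definition (selected positions divided by the number of k-mers) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Translation table for complementing DNA bases (assumes an ACGT-only alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmer(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def canonical_minimizers(seq: str, k: int, w: int) -> List[Tuple[int, str]]:
    """Select one canonical k-mer per window of w consecutive k-mers.

    Returns (position, canonical k-mer) pairs, with each position reported once.
    Ties inside a window are broken by taking the leftmost k-mer (an assumption).
    """
    n_kmers = len(seq) - k + 1
    if k <= 0 or w <= 0 or n_kmers < w:
        return []
    canon = [canonical_kmer(seq[i:i + k]) for i in range(n_kmers)]
    selected: List[Tuple[int, str]] = []
    for start in range(n_kmers - w + 1):
        # min() returns the first minimal index, i.e. the leftmost smallest k-mer.
        best = min(range(start, start + w), key=lambda i: canon[i])
        if not selected or selected[-1][0] != best:
            selected.append((best, canon[best]))
    return selected


def density(seq: str, k: int, w: int) -> float:
    """Fraction of k-mer positions selected as minimizers (assumed definition)."""
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return 0.0
    return len(canonical_minimizers(seq, k, w)) / n_kmers
```

As a usage example, `canonical_minimizers("GATTACA", 3, 3)` selects the canonical k-mers `AAT` and `ACA`, and the same set of k-mers is selected from the reverse-complement strand `TGTAATC`, which is the strand-independence property that canonical minimizers are meant to provide.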


Journal: Bioinformatics
Publication Date: Jan 25, 2024
Citations: 2
License type: CC BY 4.0
