Seedability: optimizing alignment parameters for sensitive sequence comparison.

Lorraine A K Ayad, Solon P Pissis, Rayan Chikhi

doi:10.1093/bioadv/vbad108

Open Access

Abstract

Most sequence alignment techniques make use of exact k-mer hits, called seeds, as anchors to optimize alignment speed. A large number of bioinformatics tools employing seed-based alignment techniques, such as Minimap2, use a single value of k per sequencing technology, without a strong guarantee that this is the best possible value. Given the ubiquity of sequence alignment, identifying values of k that lead to more sensitive alignments is thus an important task. To aid this, we present Seedability, a seed-based alignment framework designed for estimating an optimal seed k-mer length (as well as a minimal number of shared seeds) based on a given alignment identity threshold. In particular, we were motivated to make Minimap2 more sensitive in the pairwise alignment of short sequences. The experimental results herein show improved alignments of short and divergent sequences when using the parameter values determined by Seedability in comparison to the default values of Minimap2. We also show several cases of pairs of real divergent sequences, where the default parameter values of Minimap2 yield no output alignments, but the values output by Seedability produce plausible alignments. Seedability is available at https://github.com/lorrainea/Seedability (distributed under GPL v3.0).
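To make the notion of a seed concrete, the following minimal Python sketch counts the exact k-mers shared by two sequences and searches, by brute force, for the largest k that still leaves a required number of shared seeds. It only illustrates why a large k can leave divergent sequences with no anchors; it is not the estimation method used by Seedability, and the function names (`kmers`, `shared_seed_count`, `largest_k_with_seeds`), the `k_max` bound, and the toy sequences are all assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_seed_count(a, b, k):
    """Count the distinct k-mers that occur exactly in both sequences."""
    return len(kmers(a, k) & kmers(b, k))


def largest_k_with_seeds(a, b, min_seeds=1, k_max=31):
    """Scan k downwards from k_max and return the first (largest) k for
    which the two sequences share at least min_seeds distinct k-mers.
    Returns None if no such k exists. Hypothetical helper, brute force."""
    for k in range(min(k_max, len(a), len(b)), 0, -1):
        if shared_seed_count(a, b, k) >= min_seeds:
            return k
    return None


if __name__ == "__main__":
    # Toy pair differing by a single substitution (illustrative data only).
    a = "ACGTACGTGA"
    b = "ACGTTCGTGA"
    for k in (4, 5, 6):
        print(k, shared_seed_count(a, b, k))
    print(largest_k_with_seeds(a, b, min_seeds=1))
    print(largest_k_with_seeds(a, b, min_seeds=3))
```

On this toy pair, a single substitution already removes every shared seed at k = 6, while k = 4 still leaves three, which is the trade-off between seed length and sensitivity that the abstract describes.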

Full Text