Algorithms to compute the Burrows-Wheeler Similarity Distribution

Felipe A Louza, Guilherme P Telles, Simon Gog, Liang Zhao

doi:10.1016/j.tcs.2019.03.012

Abstract

The Burrows-Wheeler transform (BWT) is a well-studied text transformation widely used in data compression and text indexing. The BWT of two strings can also provide similarity measures between them, based on the observation that the more their symbols are intermixed in the transformation, the more similar the strings are. In this article we present two new algorithms to compute BWT-based similarity measures for string collections. In particular, we present practical and theoretical improvements to the computation of the Burrows-Wheeler Similarity Distribution for all pairs of strings in a collection. Our algorithms take advantage of the BWT computed for the concatenation of all strings, and use compressed data structures that reduce the running time while keeping a small memory footprint, as shown by a set of experiments with real and artificial datasets.
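To make the "intermixing" idea concrete, the following is a minimal, naive sketch and not one of the algorithms presented in the article. It assumes one common formulation of the measure: sort the suffixes of the two strings together, record which string each suffix comes from, and take the distribution of the lengths of runs of equal labels. The function names, the separator symbols `\x01` and `\x00`, and the quadratic-time suffix sort are all illustrative assumptions; the inputs are assumed not to contain the separators.

```python
from collections import Counter
from itertools import groupby


def origin_labels(s1, s2):
    """Label each sorted suffix with the string it starts in (0 or 1).

    Assumption: "\x01" and "\x00" act as terminators that are smaller than
    every symbol of s1 and s2. Suffixes are sorted naively for clarity.
    """
    text = s1 + "\x01" + s2 + "\x00"
    boundary = len(s1)  # index of the first terminator
    order = sorted(range(len(text)), key=lambda i: text[i:])
    labels = []
    for i in order:
        if text[i] in "\x00\x01":
            continue  # skip the suffixes that start at a terminator
        labels.append(0 if i < boundary else 1)
    return labels


def run_length_distribution(labels):
    """Return {run length: fraction of runs with that length}."""
    runs = [len(list(group)) for _, group in groupby(labels)]
    counts = Counter(runs)
    return {k: c / len(runs) for k, c in sorted(counts.items())}


def expectation(dist):
    """Mean run length; values near 1 mean heavily intermixed strings."""
    return sum(k * p for k, p in dist.items())


if __name__ == "__main__":
    for a, b in [("abab", "abab"), ("aaa", "bbb")]:
        dist = run_length_distribution(origin_labels(a, b))
        print(a, b, dist, expectation(dist))
```

Under these assumptions, two identical strings alternate perfectly in the sorted order, so every run has length 1, while two strings over disjoint alphabets form one long run each. A practical implementation would replace the naive sort with a suffix array and compressed data structures, which is the kind of improvement the abstract refers to.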

Journal: Theoretical Computer Science
Publication Date: Mar 13, 2019
Citations: 38
License type: publisher-specific-oa

Similar Papers

Computing Burrows-Wheeler Similarity Distributions for String Collections
Felipe A Louza ... Liang Zhao
01 Jan 2018

Indexing the bijective BWT
...
01 Jun 2019

Geometric Burrows-Wheeler Transform: Linking Range Searching and Text Indexing
Yu-Feng Chien ... Rahul Shah
01 Mar 2008

PBWT: achieving succinct data structures for parameterized pattern matching and related problems
...
16 Jan 2017
