Computing matching statistics on Wheeler DFAs.

Alessio Conte, Marinella Sciortino, Nicola Prezza, Giovanni Manzini, Nicola Cotumaccio, Travis Gagie

doi:10.1109/dcc55655.2023.00023


Open Access


Journal: Proceedings. Data Compression Conference
Publication Date: Mar 1, 2023
Citations: 3

Affiliation: Azienda Ospedaliera Universitaria Pisana, University of Palermo, Ca' Foscari University of Venice, Gran Sasso Science Institute, Dalhousie University

#Longest Common Prefix Array #Longest Common Prefix


Abstract

Matching statistics were introduced to solve the approximate string matching problem, a recurring subroutine in bioinformatics applications. In 2010, Ohlebusch et al. [SPIRE 2010] proposed a time- and space-efficient algorithm for computing matching statistics that relies on some components of a compressed suffix tree, notably the longest common prefix (LCP) array. In this paper, we show how their algorithm can be generalized from strings to Wheeler deterministic finite automata (Wheeler DFAs). Most importantly, we introduce a notion of LCP array for Wheeler automata, thus establishing a first clear step towards extending (compressed) suffix tree functionalities to labeled graphs.
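
For readers unfamiliar with the term, the sketch below illustrates matching statistics in the plain string setting only. It assumes the standard definition, in which entry i is the length of the longest prefix of pattern[i:] that occurs somewhere in the text. The function name matching_statistics and the brute-force substring test are illustrative choices, not the algorithm of Ohlebusch et al. and not the Wheeler DFA generalization described in the abstract; an efficient implementation would replace the substring test with index operations such as backward search and LCP-based interval widening.

```python
def matching_statistics(pattern: str, text: str) -> list[int]:
    """Naive matching statistics of `pattern` against `text` (illustrative only).

    ms[i] is the length of the longest prefix of pattern[i:] that occurs
    as a substring of text.
    """
    ms = []
    length = 0
    for i in range(len(pattern)):
        # If pattern[i-1 : i-1+length] occurs in text, then dropping its first
        # character gives a match of length-1 starting at i, so resume there.
        length = max(length - 1, 0)
        # Extend the match one character at a time while it still occurs.
        while i + length < len(pattern) and pattern[i:i + length + 1] in text:
            length += 1
        ms.append(length)
    return ms


print(matching_statistics("abcab", "xabcx"))  # [3, 2, 1, 2, 1]
```

Here "abc" is the longest match starting at position 0, while position 3 matches only "ab" because the pattern ends there. The reuse of the previous length mirrors, in a very simplified way, why an LCP-like structure is useful: it tells the algorithm how much of a match survives when the context changes.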
