Abstract

Run-length encoded Burrows–Wheeler Transformed strings, resulting in Run-Length BWT (RLBWT), is a powerful tool for processing highly repetitive strings. We propose a new algorithm for online RLBWT working in run-compressed space, which runs in O(nlg⁡r) time and O(rlg⁡n) bits of space, where n is the length of input string S received so far and r is the number of runs in the BWT of the reversed S. We improve the state-of-the-art algorithm for online RLBWT in terms of empirical construction time. Adopting the dynamic list for maintaining a total order, we can replace rank queries in a dynamic wavelet tree on a run-length compressed string by the direct comparison of labels in a dynamic list. Enlisting the proposed online RLBWT, we can efficiently compute the LZ77 factorization in run-compressed space. The empirical results show the efficiencies of both our online RLBWT and LZ77 parsing, especially for highly repetitive strings.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call