Abstract
Run-length encoding of Burrows–Wheeler transformed strings, which yields the run-length BWT (RLBWT), is a powerful tool for processing highly repetitive strings. We propose a new algorithm for online RLBWT construction that works in run-compressed space: it runs in O(n lg r) time and uses O(r lg n) bits of space, where n is the length of the input string S received so far and r is the number of runs in the BWT of the reversed S. Our algorithm improves on the state-of-the-art algorithm for online RLBWT in terms of empirical construction time. By adopting a dynamic list that maintains a total order, we replace rank queries on a dynamic wavelet tree over a run-length compressed string with direct comparisons of labels in the list. Using the proposed online RLBWT, we can efficiently compute the LZ77 factorization in run-compressed space. Empirical results show the efficiency of both our online RLBWT and our LZ77 parsing, especially for highly repetitive strings.
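To make the quantities n and r concrete, the following is a minimal offline sketch in Python. It is not the online algorithm proposed here: it builds the BWT naively by sorting all rotations, which takes far more than run-compressed space. The function names and the "$" sentinel are assumptions made for illustration only.

```python
from itertools import groupby


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text + sentinel, take the last column.

    Assumes the sentinel does not occur in text and sorts before every
    other character (true for "$" against ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse each maximal run of equal characters into (char, length)."""
    return [(ch, sum(1 for _ in group)) for ch, group in groupby(s)]


def rlbwt_of_reversed(text: str) -> list[tuple[str, int]]:
    """RLBWT of the reversed input; r is the length of the returned list."""
    return run_length_encode(bwt(text[::-1]))


runs = rlbwt_of_reversed("banana")
print(runs)       # [('b', 1), ('n', 2), ('$', 1), ('a', 3)]
print(len(runs))  # r = 4, for n = 6 input characters
```

For highly repetitive inputs, r tends to be much smaller than n, which is why a space bound of O(r lg n) bits is attractive.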