Abstract
We present a four-stage algorithm that updates the Burrows–Wheeler Transform of a text T when this text is modified. The Burrows–Wheeler Transform is used by many text compression applications and some self-index data structures. It operates by reordering the letters of a text T to obtain a new text bwt(T) that can be compressed more effectively. Even though recent advances are giving this structure new applications, a major bottleneck remains: bwt(T) has to be reconstructed entirely from scratch whenever T is modified. We study how standard edit operations (insertion, deletion, or substitution of a letter or a factor) that transform a text T into T′ affect bwt(T). We then present an algorithm that directly converts bwt(T) into bwt(T′). Based on this algorithm, we also sketch a method for converting the suffix array of T into the suffix array of T′. Finally, we show, based on the experiments we conducted, that this algorithm, whose worst-case time complexity is O(|T| log |T| (1 + log σ / log log |T|)), performs well in practice and is an advantageous replacement for the traditional approach of rebuilding bwt(T′) from scratch.
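To make the bottleneck concrete, the following minimal sketch shows the traditional from-scratch construction that the abstract refers to, not the four-stage update algorithm itself. It builds the transform by sorting all rotations of the text. The function name `bwt`, the `$` end-of-text sentinel, and the example strings are illustrative assumptions and do not come from the paper.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column.

    Assumption: `sentinel` does not occur in `text` and sorts before every
    character of `text` (true for "$" versus ASCII letters and digits).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Build every cyclic rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


# T and an edited T' obtained by inserting one letter ("d").
before = bwt("banana")
after = bwt("bandana")
print(before)  # annb$aa
print(after)
```

This naive version materializes and re-sorts every rotation of the text after each edit, however small, which is the cost that a direct conversion from bwt(T) to bwt(T′) is meant to avoid.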