Abstract

We present algorithms computing the non-overlapping Lempel–Ziv-77 factorization and the longest previous non-overlapping factor table within small space in linear or near-linear time with the help of modern suffix tree representations fitting into limited space. With similar techniques, we show how to answer substring compression queries for the Lempel–Ziv-78 factorization with a possible logarithmic multiplicative slowdown depending on the used suffix tree representation.

Highlights

  • Given a text T of length n whose characters are drawn from an integer alphabet of size σ = n^{O(1)}, we study the problem of computing the non-overlapping LZSS factorization memory-efficiently with the aid of two suffix tree representations, which were used by Fischer et al. [4] (Section 2.2) to compute the classic LZ77, LZSS, and LZ78 factorizations in linear time within the asymptotic space requirements of the respective suffix tree.

  • We study the substring compression query problem [6], where the task is to compute the factorization of a given substring of the text in time that depends on the number of computed factors and possibly, by a logarithmic factor, on the text length.

  • We use techniques introduced by Fischer et al. [4], which work on the succinct suffix tree (SST) and the compressed suffix tree (CST), to tackle the non-overlapping LZSS factorization and the LZ78 substring compression query problem.
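To make the LZ78 substring compression query concrete, the following is a minimal baseline sketch, not the suffix-tree-based method of the paper. It simply copies the queried substring and runs a textbook trie-based LZ78 factorization on it, so its running time depends on the substring length rather than on the number of factors. The function names `lz78_factorize` and `lz78_substring_query`, the 0-based half-open interval convention, and the `(referred factor, new character)` output format are assumptions made for illustration.

```python
def lz78_factorize(text):
    """Textbook LZ78: each factor is (index of referred factor, new character).

    Index 0 denotes the empty factor. The last factor may have no new
    character (None) if the text ends inside an already known factor.
    """
    trie = {}        # (parent node id, character) -> child node id; root is 0
    factors = []
    node = 0         # current position in the LZ78 trie
    next_id = 1      # node id k corresponds to the k-th factor
    for ch in text:
        child = trie.get((node, ch))
        if child is None:
            # The longest known prefix ends here: emit a new factor.
            trie[(node, ch)] = next_id
            factors.append((node, ch))
            next_id += 1
            node = 0
        else:
            node = child
    if node != 0:
        factors.append((node, None))
    return factors


def lz78_substring_query(text, start, end):
    """Naive query: factorize text[start:end] from scratch (assumed convention)."""
    return lz78_factorize(text[start:end])


if __name__ == "__main__":
    print(lz78_substring_query("abaaaab", 2, 6))
```

The query above touches every character of the substring; the point of the substring compression query problem is to preprocess the text so that the answer can be produced faster than this baseline.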


Summary

Introduction

Given a text T of length n whose characters are drawn from an integer alphabet of size σ = n^{O(1)}, we want to study the problem of computing the non-overlapping LZSS factorization memory-efficiently with the aid of two suffix tree representations, which were used by Fischer et al. [4] (Section 2.2) to compute the classic LZ77, LZSS, and LZ78 factorizations in linear time within the asymptotic space requirements of the respective suffix tree.

Given a text T[1..n] of length n whose characters are drawn from an integer alphabet of size σ = n^{O(1)}, we can compute its non-overlapping LZSS factorization in O(ε^{-1} n) time using (1 + ε) n lg n + O(n) bits (excluding the read-only text T), or in O(n lg^ε n) time using O(n lg σ) bits, for a selectable constant ε ∈ …
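As an illustration of what is being computed, here is a naive sketch of the non-overlapping LZSS factorization. It assumes the usual definition: each factor is either the first occurrence of a character or the longest prefix of the remaining text that occurs entirely within the already factorized part. It relies on repeated substring searches and is far slower than the bounds stated above; it does not use suffix trees. The function names and the `(source position, length)` output format are assumptions made for illustration.

```python
def non_overlapping_lzss(text):
    """Naive non-overlapping LZSS factorization (illustrative sketch).

    Returns a list whose entries are either a single character (literal)
    or a pair (source, length) with 0-based source position, where the
    referenced occurrence lies completely inside text[:i].
    """
    factors = []
    n = len(text)
    i = 0
    while i < n:
        length = 0
        source = -1
        # Extend the factor while it still occurs entirely before position i.
        while i + length < n:
            pos = text.find(text[i:i + length + 1], 0, i)
            if pos == -1:
                break
            length += 1
            source = pos
        if length == 0:
            factors.append(text[i])       # first occurrence of this character
            i += 1
        else:
            factors.append((source, length))
            i += length
    return factors


def decode(factors):
    """Rebuild the text from the factor list produced above."""
    out = []
    for f in factors:
        if isinstance(f, str):
            out.append(f)
        else:
            source, length = f
            # No overlap: the source range is already fully decoded.
            out.extend(out[source:source + length])
    return "".join(out)


if __name__ == "__main__":
    print(non_overlapping_lzss("aaaa"))
```

On the text `aaaa`, the sketch yields a literal `a` followed by references of length 1 and 2, whereas the classic (overlapping) LZSS factorization could cover the last three characters with a single self-overlapping reference.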

Preliminaries
Non-Overlapping LZSS
The Factorization Algorithm
Complexity Bounds
Storing the Factorization
Computing LPnF
Substring Compression Query Problem
Related Substring Compression Query Problems
LZ78 Factorization
Linear-Time Computation
Outline
Space-Efficient Computation
Navigation in Small Space
LZ78 Coding
Centroid-Path Decomposed Suffix Tree
Conclusions
