Abstract

Given m documents of total length n, we consider the problem of finding a longest string common to at least d ≥ 2 of the documents. This problem is known as the longest common substring (LCS) problem and has a classic $\mathcal{O}(n)$ space and $\mathcal{O}(n)$ time solution (Weiner [FOCS’73], Hui [CPM’92]). However, the use of linear space is impractical in many applications. In this paper we show that for any trade-off parameter 1 ≤ τ ≤ n, the LCS problem can be solved in $\mathcal{O}(\tau)$ space and $\mathcal{O}(n^2/\tau)$ time, thus providing the first smooth deterministic time-space trade-off from constant to linear space. The result uses a new and very simple algorithm, which computes a τ-additive approximation to the LCS in $\mathcal{O}(n^2/\tau)$ time and $\mathcal{O}(1)$ space. We also show a time-space trade-off lower bound for deterministic branching programs, which implies that any deterministic RAM algorithm solving the LCS problem on documents from a sufficiently large alphabet in $\mathcal{O}(\tau)$ space must use $\Omega\left(n\sqrt{\log(n/(\tau\log n))/\log\log(n/(\tau\log n))}\right)$ time.
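To make the constant-space end of the trade-off concrete, the following is a minimal sketch of a textbook baseline for the special case of two documents (m = d = 2). It is not the algorithm of the paper, and the function name `longest_common_substring` is an illustrative assumption. It scans every diagonal of the implicit comparison grid, so it uses a constant number of working variables and time proportional to the product of the two lengths, which matches the $\mathcal{O}(n^2/\tau)$ bound only at τ = 1.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest common substring of a and b.

    Illustrative baseline only (not the paper's method): O(len(a) * len(b))
    time and O(1) working variables besides the input and the returned slice.
    """
    best_len = 0  # length of the longest match found so far
    best_end = 0  # exclusive end index of that match inside a

    # A diagonal of the comparison grid is fixed by shift = i - j, where
    # i indexes a and j indexes b.
    for shift in range(-(len(b) - 1), len(a)):
        i = max(shift, 0)
        j = i - shift
        run = 0  # length of the current run of equal characters
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                run += 1
                if run > best_len:
                    best_len = run
                    best_end = i + 1
            else:
                run = 0
            i += 1
            j += 1

    return a[best_end - best_len:best_end]
```

For example, `longest_common_substring("xabcdy", "zzabcdw")` returns `"abcd"`. Extending this idea to m documents with a threshold d, or trading the extra factor of time for τ words of space, requires the techniques the paper develops and is not attempted here.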
