Abstract

Edit distance (ED) and longest common subsequence (LCS) are two fundamental problems that quantify how similar two strings are to one another. In this paper, we first consider these problems in the asymmetric streaming model introduced by Andoni, Krauthgamer and Onak [Andoni et al., 2010] (FOCS'10) and Saks and Seshadhri [Saks and Seshadhri, 2013] (SODA'13). In this model we have random access to one string and streaming access to the other. Our main contribution is a constant-factor approximation algorithm for ED with memory O(n^δ) for any constant δ > 0. In addition, we present an upper bound of O_ε(√n) on the memory needed to approximate ED or LCS within a factor of 1 ± ε. All our algorithms are deterministic, make a single pass, and run in polynomial time. We further study small-space approximation algorithms for ED, LCS, and longest increasing subsequence (LIS) in the non-streaming setting. Here, we design algorithms that achieve a 1 ± ε approximation for all three problems, where ε > 0 can be any constant or even slightly sub-constant. Our algorithms use only poly-logarithmic space while maintaining a polynomial running time. This significantly improves on previous results in terms of space complexity, all of which use space at least Ω(√n). Our algorithms make novel use of the triangle inequality and carefully designed recursions to save space, which may be of independent interest.
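For readers unfamiliar with the three quantities, the sketch below computes ED, LCS, and LIS exactly using the textbook dynamic programs. It is not the paper's method: these baselines keep a full row of the table, so they need space linear in the length of `y` (or of the input sequence for LIS), which is the cost the approximation algorithms described above are designed to avoid. The function names are illustrative assumptions. In `edit_distance` and `lcs_length`, `x` is read once from left to right, loosely mirroring the streamed string, while `y` is indexed freely.

```python
from bisect import bisect_left


def edit_distance(x, y):
    """Exact edit distance (insert, delete, substitute) with one DP row."""
    prev = list(range(len(y) + 1))  # distance from the empty prefix of x
    for i, cx in enumerate(x, 1):   # single left-to-right pass over x
        cur = [i]
        for j, cy in enumerate(y, 1):
            cur.append(min(prev[j] + 1,                # delete cx
                           cur[j - 1] + 1,             # insert cy
                           prev[j - 1] + (cx != cy)))  # match or substitute
        prev = cur
    return prev[-1]


def lcs_length(x, y):
    """Exact length of a longest common subsequence with one DP row."""
    prev = [0] * (len(y) + 1)
    for cx in x:                    # single left-to-right pass over x
        cur = [0]
        for j, cy in enumerate(y, 1):
            if cx == cy:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lis_length(seq):
    """Exact length of a longest strictly increasing subsequence."""
    tails = []  # tails[k] = smallest tail of an increasing run of length k+1
    for v in seq:
        k = bisect_left(tails, v)
        if k == len(tails):
            tails.append(v)
        else:
            tails[k] = v
    return len(tails)
```

For example, `edit_distance("kitten", "sitting")` evaluates to 3 and `lcs_length("ABCBDAB", "BDCABA")` to 4.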
