Abstract

We prove the first nontrivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings. To the best of our knowledge, this is the first computational setting in which the complexity of estimating the edit distance is provably larger than that of Hamming distance. Our lower bound exhibits a trade-off between approximation and communication, asserting, for example, that protocols with $O(1)$ bits of communication can obtain only approximation $\alpha\geq\Omega(\log d/\log\log d)$, where $d$ is the length of the input strings. This case of $O(1)$ communication is of particular importance since it captures constant-size sketches as well as embeddings into spaces like $\ell_1$ and squared-$\ell_2$, two prevailing algorithmic approaches for dealing with edit distance. Indeed, the known nontrivial communication upper bounds are all derived from embeddings into $\ell_1$. By excluding low-communication protocols for edit distance, we rule out a strictly richer class of algorithms than previous results. Furthermore, our lower bound holds not only for strings over a binary alphabet but also for strings that are permutations (aka the Ulam metric). For this case, our bound nearly matches an upper bound known via embedding the Ulam metric into $\ell_1$. Our proof uses a new technique that relies on Fourier analysis in a rather elementary way.
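For readers less familiar with the two metrics being compared, the following is a minimal illustrative sketch, not taken from the paper, of exact Hamming and edit distance on a pair of binary strings. The function names `hamming_distance` and `edit_distance` are hypothetical, and the edit distance uses the textbook dynamic program rather than any sketching or communication protocol discussed above.

```python
def hamming_distance(x: str, y: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(a != b for a, b in zip(x, y))


def edit_distance(x: str, y: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning x into y."""
    # prev[j] holds the distance between the current prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, a in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty prefix of y
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from x
                curr[j - 1] + 1,         # insert b into x
                prev[j - 1] + (a != b),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


# A cyclic shift: every position differs, yet two edits suffice
# (delete the leading character, then append it at the end).
x = "0101010101"
y = "1010101010"
print(hamming_distance(x, y), edit_distance(x, y))
```

The shifted pair shows how far apart the two distances can be on the same inputs: position-by-position comparison sees every coordinate as different, while edit distance can realign the strings with an insertion and a deletion.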
