Abstract

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem is to find a longest substring common to S and T. This is a classical problem in computer science with an O(n)-time solution. In the fully dynamic setting, edit operations are allowed in either of the two strings, and the problem is to find an LCS after each edit. We present the first solution to the fully dynamic LCS problem requiring sublinear time in n per edit operation. In particular, we show how to find an LCS after each edit operation in Õ(n^{2/3}) time, after Õ(n)-time and space preprocessing. This line of research was recently initiated in a somewhat restricted dynamic variant by Amir et al. [SPIRE 2017]. More specifically, the authors presented an Õ(n)-sized data structure that returns an LCS of the two strings after a single edit operation (that is reverted afterwards) in Õ(1) time. At CPM 2018, three papers (Abedin et al., Funakoshi et al., and Urabe et al.) studied analogously restricted dynamic variants of problems on strings; specifically, computing the longest palindrome and the Lyndon factorization of a string after a single edit operation. We develop dynamic sublinear-time algorithms for both of these problems as well. We also consider internal LCS queries, that is, queries in which we are to return an LCS of a pair of substrings of S and T. We show that answering such queries is hard in general and propose efficient data structures for several restricted cases.

Highlights

  • Given two strings S and T, each of length at most n, the longest common substring (LCS) problem, also known as the longest common factor problem, is to find a longest substring common to S and T.

  • Amir et al. [11] introduced a restricted dynamic variant in which any single edit operation is allowed and is reverted afterwards. We call this problem LCS after One Edit. They presented an Õ(n)-sized data structure that can be constructed in Õ(n) time and supports Õ(1)-time computation of an LCS after one edit operation is applied to S.

  • We present the first fully dynamic algorithm for the LCS problem that works in strongly sublinear time per edit operation in either of the two strings.

Summary

Introduction

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem, also known as the longest common factor problem, is to find a longest substring common to S and T.

In the dynamic version, given two strings S and T, the problem is to answer the following type of query in an on-line manner: perform an edit operation (substitution, insertion, or deletion) on S or on T and return an LCS of the new S and T. We call this problem Fully Dynamic LCS.

The contributions include a fully dynamic algorithm, requiring Õ ( n) time per edit, for computing a longest Lyndon substring of a string S as well as maintaining a representation of the Lyndon factorization of S that allows us to extract the t-th element of the factorization in Õ(1) time.

Compared with the preliminary version, we greatly simplify the algorithm for the fully dynamic LCS problem, provide a more detailed study of internal LCS queries, and include sections and proofs that were omitted from the preliminary version due to space constraints.
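To make the Fully Dynamic LCS query model concrete, the following Python sketch shows a naive baseline that simply recomputes an LCS from scratch after every edit. It is not the sublinear-time algorithm of the paper: it uses the textbook quadratic dynamic program, and the names `longest_common_substring` and `NaiveDynamicLCS` are hypothetical, introduced here only for illustration.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return one longest common substring of s and t.

    Textbook O(|s| * |t|) dynamic program: cur[j] is the length of the
    longest common suffix of s[:i] and t[:j].
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


class NaiveDynamicLCS:
    """Hypothetical baseline: apply an edit, then recompute an LCS from scratch."""

    def __init__(self, s: str, t: str) -> None:
        # Lists of characters so that edits can be applied in place.
        self.strings = {"S": list(s), "T": list(t)}

    def lcs(self) -> str:
        return longest_common_substring(
            "".join(self.strings["S"]), "".join(self.strings["T"])
        )

    def substitute(self, which: str, pos: int, ch: str) -> str:
        self.strings[which][pos] = ch
        return self.lcs()

    def insert(self, which: str, pos: int, ch: str) -> str:
        self.strings[which].insert(pos, ch)
        return self.lcs()

    def delete(self, which: str, pos: int) -> str:
        del self.strings[which][pos]
        return self.lcs()
```

For example, `NaiveDynamicLCS("banana", "ananas").lcs()` returns `"anana"`. Each edit in this baseline costs time quadratic in n (or linear with a suffix-tree based LCS routine), which is the per-edit cost that a sublinear-time dynamic algorithm is meant to beat.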

Preliminaries
Internal LCS Queries
A Lower Bound Based on Set Disjointness
Auxiliary Data Structures Over the Suffix Tree
Internal Queries for Special Substrings
Three Substrings LCS Queries
LCS After One Substitution Per String
LCS Contains a Changed Position in Exactly One of the Strings
LCS Contains a Changed Position in Each of the Strings
Fully Dynamic LCS
Fully Dynamic Longest Repeat
General Scheme for Dynamic Problems on Strings
Fully Dynamic Longest Palindrome Substring
Internal Queries
Cross‐Substring Queries
Round‐Up
Fully Dynamic Longest Lyndon Substring
Final Remarks