Improved Approximation Results on the Shortest Common Supersequence Problem

Zvi Gotthilf, Moshe Lewenstein

doi:10.1007/978-3-642-03784-9_27

Abstract

Finding the Shortest Common Supersequence (SCS) of an arbitrary number of input strings is a well-studied problem. Given a set L of k strings, \(s_1, s_2, \ldots, s_k\), over an alphabet Σ, their SCS is the shortest string that contains each of the input strings as a subsequence. The problem is known to be NP-hard [8], even over a binary alphabet [12]. In this paper we focus on approximating two NP-hard variants of the SCS problem. For the first variant, where all input strings are of length 2, we present a \(2 - \frac{2}{1 + \log{n}\log{\log{n}}}\)-approximation algorithm, where |Σ| = n. This result improves on the \(2 - \frac{4}{n+1}\)-approximation algorithm presented in [17]. Moreover, we present a \(\frac{7}{6}\)-approximation algorithm (\(\approx 1.167\)) for the restricted, but still NP-hard, variant where all input strings are of length 2 and every character in Σ has at most 3 occurrences in L.
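To make the definition concrete, the following minimal Python sketch checks whether a candidate string is a common supersequence of a set of strings and finds an SCS by exhaustive search. It is an illustration only, not the approximation algorithm of the paper: the function names are hypothetical, and the search takes exponential time, so it is usable only on very small inputs.

```python
from itertools import product


def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting characters."""
    it = iter(t)
    # `ch in it` advances the iterator, so matches must occur in order.
    return all(ch in it for ch in s)


def is_common_supersequence(candidate, strings):
    """Return True if every input string is a subsequence of candidate."""
    return all(is_subsequence(s, candidate) for s in strings)


def brute_force_scs(strings):
    """Return one shortest common supersequence by trying all strings
    over the input alphabet in order of increasing length.

    Illustrative only: exponential time. The search always terminates,
    because the concatenation of the inputs is a common supersequence.
    """
    alphabet = sorted(set("".join(strings)))
    length = 0
    while True:
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if is_common_supersequence(candidate, strings):
                return candidate
        length += 1


if __name__ == "__main__":
    # An instance of the first variant: every input string has length 2.
    L = ["ab", "bc", "ca"]
    scs = brute_force_scs(L)
    print(scs, len(scs))
```

For the instance `["ab", "bc", "ca"]`, no string of length 3 works, since the three order constraints form a cycle, so any SCS has length 4 (for example `abca`).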