Abstract

In several areas, for example in bioinformatics and in AI planning, the Shortest Common Superstring problem (SCS) and variants thereof have been successfully applied for string comparison. In this paper we consider two recently introduced variants of SCS, namely Restricted Common Superstring (RCS) and Swapped Common Superstring (SRCS). In RCS we are given a set $S$ of strings and a multiset $\mathcal{M}$ of symbols, and we look for an ordering $\mathcal{M}_o$ of $\mathcal{M}$ such that the number of input strings that are substrings of $\mathcal{M}_o$ is maximized. In SRCS we are given a set $S$ of strings and a text $\mathcal{T}$, and we look for a swap ordering $\mathcal{T}_o$ of $\mathcal{T}$ (an ordering of $\mathcal{T}$ obtained by swapping only some pairs of adjacent symbols) such that the number of input strings that are substrings of $\mathcal{T}_o$ is maximized. We propose a multivariate algorithmic analysis of the two problems, aiming to determine how different parameters influence their complexity. The parameters we consider are the size of the solution (that is, the number of input strings contained in the computed superstring), the maximum length of the input strings, and the size of the alphabet over which the input strings range. First, we give two fixed-parameter algorithms, where the parameter is the size of the solution, for SRCS and $\ell$-RCS (the RCS problem restricted to strings of length bounded by a parameter $\ell$). Furthermore, we complement these results by showing that SRCS and $\ell$-RCS do not admit a polynomial kernel unless $NP \subseteq coNP/poly$. Then, we show that SRCS is APX-hard even when the input strings have length bounded by a constant (equal to 10) or are over a binary alphabet.
