Abstract

The Closest String Problem is defined as follows. Let $$S$$ be a set of $$k$$ strings $$\{s_1,\ldots ,s_k\}$$, each of length $$\ell$$. Find a string $$s^*$$ such that the maximum Hamming distance from $$s^*$$ to any of the strings is minimized. We denote this distance by $$d$$. The string $$s^*$$ is called a consensus string. In this paper we present two main algorithms for this problem: the Configuration algorithm, with $$O(k^2 \ell ^ k)$$ running time, and the Minority algorithm. The problem was introduced by Lanctot et al. [SODA'99 and (Inf Comput 185(1):41-55, 2003)]. They showed that the problem is $$\mathcal {NP}$$-hard and provided an approximation algorithm based on Integer Programming. Since then the closest string problem has been studied extensively in both computational biology and theoretical computer science. This research can be roughly divided into three categories: approximate, exact, and practical solutions. This paper falls under the exact solutions category. Despite the great effort to obtain efficient algorithms for this problem, an algorithm with the natural running time of $$O(\ell ^ k)$$ was not known. In this paper we close this gap. Our result means that algorithms solving the closest string problem in times $$O(\ell ^2), O(\ell ^3), O(\ell ^4)$$ and $$O(\ell ^5)$$ exist for the cases of $$k=2,3,4$$ and $$5$$, respectively. It is known that, in fact, the cases of $$k=2,3,$$ and $$4$$ can be solved in linear time. No efficient algorithm is currently known for the case of $$k=5$$. We prove two lemmas, the unit square lemma and the minority lemma, that exploit surprising properties of the closest string problem and enable constructing the closest string in a sequential fashion. These lemmas, together with some additional ideas, give an $$O(\ell ^2)$$ algorithm for computing a closest string of $$5$$ binary strings. Algorithm Minority is based on these lemmas.
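
To make the problem definition concrete, the following is a minimal Python sketch of an exhaustive search for a consensus string. It is not the Configuration or Minority algorithm from the paper: it simply tries every candidate string, so its running time is exponential in $$\ell$$. The function names `hamming` and `closest_string_bruteforce`, and the default binary alphabet, are assumptions made for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which the equal-length strings a and b differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_string_bruteforce(strings, alphabet="01"):
    """Exhaustive search (illustrative only, not the paper's method).

    Tries every string of length l over `alphabet` and returns a pair
    (s_star, d), where d is the smallest achievable maximum Hamming
    distance from a candidate to the input strings.
    """
    length = len(strings[0])
    best, best_d = None, length + 1  # no candidate can exceed distance l
    for cand in product(alphabet, repeat=length):
        s = "".join(cand)
        d = max(hamming(s, t) for t in strings)
        if d < best_d:
            best, best_d = s, d
    return best, best_d


if __name__ == "__main__":
    s_star, d = closest_string_bruteforce(["0011", "0101", "0110"])
    print(s_star, d)
```

For the three input strings in the usage example, the optimal distance is $$d = 1$$, because the string `0111` differs from each input in exactly one position and no single string can equal all three inputs.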
