Abstract

We consider the problem of finding a minimum common string partition (MCSP) of two strings, which is NP-hard. The MCSP problem is closely related to genome comparison and rearrangement, an important field in computational biology. In this paper, we map the MCSP problem to a graph using an existing technique and, from this graph, develop an Integer Linear Programming (ILP) formulation for the problem. We implement the ILP formulation and compare its results with those of state-of-the-art algorithms from the literature. The experimental results are promising.

Highlights

  • In the minimum common string partition (MCSP) problem, we are given two related strings (S, T)

  • We present an Integer Linear Programming (ILP) formulation for the MCSP problem

  • We develop an ILP formulation for the MCSP problem using the common substring graph


Summary

Introduction

In the minimum common string partition (MCSP) problem, we are given two related strings (S, T). Two strings are said to be related if every letter occurs the same number of times in both. A partition of a string S is defined as a sequence P = (B_1, B_2, ..., B_c) of strings whose concatenation is equal to S. Given a partition P of a string S and a partition Q of a string T, we say that the pair π = <P, Q> is a common partition of (S, T) if Q is a permutation of P. The minimum common string partition problem is to find a common partition of (S, T) with the minimum number of substrings, that is, to minimize c. For example, if (S, T) = (atatgat, atgatat), an optimal solution is π = {atgat, at} and the minimum common partition size is 2. A more detailed study of the applications of MCSP can be found in [1], [2] and [3].
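
The definitions above can be checked mechanically on the example pair. The following is a minimal sketch in Python, not the paper's ILP formulation; the function names `are_related` and `is_common_partition` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def are_related(s, t):
    """Two strings are related if every letter has the same frequency in both."""
    return Counter(s) == Counter(t)


def is_common_partition(s, t, p, q):
    """Check that (p, q) is a common partition of (s, t).

    p must concatenate to s, q must concatenate to t, all substrings must be
    non-empty, and q must be a permutation of p (equal as multisets).
    """
    return (
        all(p) and all(q)
        and "".join(p) == s
        and "".join(q) == t
        and Counter(p) == Counter(q)
    )


# Example pair from the text.
s, t = "atatgat", "atgatat"
p = ["at", "atgat"]   # partition of S
q = ["atgat", "at"]   # partition of T, a permutation of p

print(are_related(s, t))                # True
print(is_common_partition(s, t, p, q))  # True
print(len(p))                           # 2
```

A common partition of size 1 would require S = T, which does not hold here, so size 2 is indeed optimal for this pair.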

