Abstract
An approximation algorithm for the shortest common superstring problem is developed, based on the Knuth-Morris-Pratt string-matching procedure and on the greedy heuristics for finding longest Hamiltonian paths in weighted graphs. Given a set R of strings, the algorithm constructs a common superstring for R in O( mn) steps where m is the number of strings in R and n is the total length of these strings. The performance of the algorithm is analysed in terms of the compression in the common superstrings constructed, that is, in terms of n− k where k is the length of the obtained superstring. We show that (n−k)⩾ 1 2 (n−k min) where k min is the length of a shortest common superstring. Hence the compression achieved by the algorithm is at least half of the maximum compression. It also seems that the lengths always satisfy k⩽2· k min but proving this remains open.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.