Abstract

The edit distance between given two strings X and Y is the minimum number of edit operations that transform X into Y without performing multiple operations that involve the same position. Ordinarily, string editing is based on character insert, delete, and substitute operations. Motivated from the facts that substring reversals are observed in genomic sequences, and it is not always possible to transform a given sequence X into a given sequence Y by reversals alone (e.g., X is all 0's, and Y is all 1's), Muthukrishnan and Sahinalp [S. Muthukrishnan, S.C. Sahinalp, Approximate nearest neighbors and sequence comparison with block operations, in: Proc. ACM Symposium on Theory of Computing (STOC), 2000, pp. 416–424; S. Muthukrishnan, S.C. Sahinalp, An improved algorithm for sequence comparison with block reversals, Theoretical Computer Science 321 (1) (2004) 95–101] considered a “simple” well-defined edit distance model in which the edit operations are: replace a character, and reverse and replace a substring. A substring of X can only be reversed if the reversal results in a match in the same position in Y. The cost of each character replacement and substring reversal is 1. The distance in this case is defined only when | X | = | Y | = n . There is an algorithm for computing the distance in this model with worst-case time complexity O ( n log 2 n ) [S. Muthukrishnan, S.C. Sahinalp, An improved algorithm for sequence comparison with block reversals, Theoretical Computer Science 321 (1) (2004) 95–101]. We present a dynamic programming algorithm with worst-case time complexity O ( n 2 ) but its expected running-time is O ( n ) . In our dynamic programming solution the weights of edit operations can vary for different substitutions, and the costs of reversals can be a function of the reversal-length.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.