Abstract

We consider a string editing problem in a probabilistic framework. This problem is of considerable interest to many facets of science, most notably molecular biology and computer science. A string editing transforms one string into another by performing a series of weighted edit operations of overall maximum (minimum) cost. The problem is equivalent to finding an optimal path in a weighted grid graph. In this paper we provide several results regarding a typical behaviour of such a path. In particular, we observe that the optimal path (i.e. edit distance) is almost surely (a.s.) equal to αn for large n where α is a constant and n is the sum of lengths of both strings. More importantly, we show that the edit distance is well concentrated around its average value. In the so called independent model in which all weights (in the associated grid graph) are statistically independent, we derive some bounds for the constant α. As a by-product of our results, we also present a precise estimate of the number of alignments between two strings. To prove these findings we use techniques of random walks, diffusion limiting processes, generating functions, and the method of bounded difference.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.