Abstract

We study the duplication with transposition distance between strings of length n over a q-ary alphabet and their roots. In other words, we investigate the number of duplication operations of the form x = (abcd) → y = (abcbd), where x and y are strings and a, b, c, and d are their substrings, needed to obtain a q-ary string of length n starting from the set of strings without duplications. For exact duplication, we prove that the maximal distance between a string of length at most n and its root has the asymptotic order n/log n. For approximate duplication, where a β-fraction of symbols may be duplicated incorrectly, we show that the maximal distance has a sharp transition from the order n/log n to log n at β = (q − 1)/q. The motivation for this problem comes from genomics, where such duplications represent a special kind of mutation, and the distance between a given biological sequence and its root is the smallest number of transposition mutations required to generate the sequence.
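
As an illustration of the operation x = (abcd) → y = (abcbd) described above, the following minimal Python sketch applies a single exact duplication with transposition to a string. The function name and the index parameters i, j, k are hypothetical and chosen only for this example. Allowing c to be empty, which reduces the operation to a tandem duplication, is an assumption of the sketch and not something the abstract states.

```python
def duplicate_with_transposition(x: str, i: int, j: int, k: int) -> str:
    """Apply one exact duplication x = abcd -> y = abcbd.

    The split points are illustrative: a = x[:i], b = x[i:j],
    c = x[j:k], d = x[k:]. The substring b must be non-empty (i < j).
    Assumption of this sketch: c may be empty (j == k).
    """
    if not 0 <= i < j <= k <= len(x):
        raise ValueError("need 0 <= i < j <= k <= len(x)")
    a, b, c, d = x[:i], x[i:j], x[j:k], x[k:]
    # A copy of b is inserted after c, i.e. transposed past c.
    return a + b + c + b + d


# a = "A", b = "C", c = "G", d = "T"
print(duplicate_with_transposition("ACGT", 1, 2, 3))  # ACGCT
```

Each application lengthens the string by the length of b, so the distance studied in the abstract counts how many such steps are needed to reach a given string from a duplication-free one.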
