Abstract
The aim of this work is to generalize known algorithms to solve new problems arising in bioinformatics. We consider algorithms that optimize the edit distance between two sequences, the first of which is known and the second of which is a hidden palindrome of arbitrary length. Importantly, the length of the desired palindrome is itself determined by the optimization. In the first task, a palindrome must be selected from the ensemble of palindromes defined by the second input sequence. Here the input sequences need not contain the desired palindrome in full, but the second sequence contains half of it; the first input sequence is used for optimization. In another task, the palindrome may be partial, that is, only a prefix is complementary to a suffix; such a partial palindrome forms a hairpin. The new algorithms run in quadratic time, which is faster than an exhaustive search over admissible palindromes. They rely essentially on the linear dependence of the edit distance on the length of a contiguous deletion or insertion. The algorithm for the first task also allows us to calculate the similarity of a given sequence to any palindrome. In general, however, comparing two different sequences does not reduce to finding palindromes in each of them. Fast search for suboptimal solutions is also discussed. Software implementations of the algorithms are available at http://lab6.iitp.ru/-/pali. We give examples of nucleotide sequences with degenerate inverted repeats; in particular, we consider inverted repeats in noncoding regions of plastid DNA in flowering plants and in microRNA genes. The possible application of our method to the search for conserved secondary structures of RNA is also discussed.
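To make the idea of "similarity of a given sequence to any palindrome" concrete, the following is a minimal sketch of a textbook quadratic-time dynamic program that computes the smallest number of single-character insertions, deletions, and substitutions needed to turn a sequence into some palindrome. It is an illustration only, not the authors' algorithm: the function name `edit_distance_to_palindrome` is hypothetical, all operations are assumed to cost 1, and "palindrome" here means an ordinary text palindrome rather than a reverse-complement one.

```python
def edit_distance_to_palindrome(s: str) -> int:
    """Minimum unit-cost edits (insert, delete, substitute) that turn s
    into some palindrome. Runs in O(n^2) time and space.

    Assumptions (not from the paper): unit costs, plain text palindromes.
    """
    n = len(s)
    if n < 2:
        return 0  # empty and one-character strings are already palindromes

    # dp[i][j] = minimum edits that make s[i..j] (inclusive) a palindrome.
    # Entries with i == j stay 0; shorter spans are filled before longer ones.
    dp = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Cost of the span strictly between positions i and j
            # (empty when the span has length 2).
            inner = dp[i + 1][j - 1] if length > 2 else 0
            if s[i] == s[j]:
                # The two ends already mirror each other.
                dp[i][j] = inner
            else:
                dp[i][j] = 1 + min(
                    dp[i + 1][j],  # delete s[i] (or insert its mirror on the right)
                    dp[i][j - 1],  # delete s[j] (or insert its mirror on the left)
                    inner,         # substitute one end to match the other
                )
    return dp[0][n - 1]


if __name__ == "__main__":
    for seq in ["racecar", "abca", "abcd"]:
        print(seq, edit_distance_to_palindrome(seq))
```

For nucleotide sequences, where a palindrome is an inverted repeat, the equality test would presumably be replaced by a complementarity test and the base case for a single character would change; those adaptations are not shown here.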