Abstract

Computational methods on molecular sequence data are at the heart of computational molecular biology. Major applications include the identification of known or unknown DNA and RNA motifs or regions involved in biological processes such as initiation of transcription, gene expression and translation, and the discovery of various types of repeats. Accurate identification and localization of such elements will allow biologists to perform deeper studies of the structure, function and evolution of genomes. This requires the development of faster and more complex mathematical models and computer algorithms. In this work we discuss current techniques for string problems in molecular sequence data. We focus on Weighted Sequences and Sequences with don't care characters, explaining the open problems and their relevance to biological applications.
