Discovering regularities in biosequences: Challenges and applications

K. Perdikuri, A. Tsakalidis, C.H. Makris

doi:10.3233/jcm-2005-5303

Abstract

Computational methods for molecular sequence data are at the heart of computational molecular biology. Applications of major concern include the identification of known or unknown DNA and RNA motifs or regions involved in biological processes such as initiation of transcription, gene expression and translation, and the discovery of various types of repeats. Accurate identification and localization of such elements will allow biologists to perform deeper studies of the structure, function and evolution of genomes. This requires the development of faster and more complex mathematical models and computer algorithms. In this work we discuss current techniques for string problems in molecular sequence data. We focus on weighted sequences and sequences with don't-care characters, explaining the open problems and their relevance to biological applications.
