Evaluation and Improvement of Fast Algorithms for Exact Matching on Genome Sequences

Simone Faro

doi:10.1007/978-3-319-38827-4_12

Abstract

With the availability of large amounts of DNA data, exact matching of nucleotide sequences has become an important application in modern computational biology and in metagenomics. In the last decade several efficient solutions for the exact string matching problem have been developed, and most of them are very fast in practical cases. However, when the pattern is short or the alphabet is small (as in the case of DNA sequences), the problem becomes more difficult to solve efficiently. In this paper we review and compare the most efficient solutions to the online exact matching problem that have appeared in recent years, as applied to searching genome sequences. In addition, we propose some new variants of an efficient string matching algorithm. Our experimental results show that the newly presented variants are very fast in most practical cases.

Full Text