A fast Boyer-Moore type pattern matching algorithm for highly similar sequences.

Nadia Ben Nsira, Thierry Lecroq, Mourad Elloumi

doi:10.1504/ijdmb.2015.072101

Abstract

In the last decade, biology and medicine have undergone a fundamental change: next generation sequencing (NGS) technologies have made it possible to obtain genomic sequences very quickly and at low cost compared to the traditional Sanger method. These NGS technologies have thus made it possible to collect genomic sequences (genes, exomes or even full genomes) of individuals of the same species. These sequences are more than 99% identical. There is thus a strong need for efficient algorithms for indexing such specific sets of sequences and performing fast pattern matching in them. In this paper we propose a very efficient algorithm that solves the exact pattern matching problem in a set of highly similar DNA sequences where only the pattern can be pre-processed. This new algorithm extends variants of the Boyer-Moore exact string matching algorithm. Experimental results show that it exhibits the best performance in practice.
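For orientation, the family of algorithms the abstract refers to can be illustrated with the Horspool variant of Boyer-Moore, in which only the pattern is pre-processed. The following is a minimal sketch of that well-known baseline, not the algorithm proposed in the paper. The function name `horspool_search` and the sample sequences are assumptions made for illustration, and each sequence is searched independently here, without exploiting the similarity between sequences.

```python
def horspool_search(pattern: str, text: str) -> list[int]:
    """Return the start positions of all exact occurrences of pattern in text.

    Illustrative Horspool variant of Boyer-Moore; NOT the paper's algorithm.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Pre-process the pattern only: for each character in pattern[:-1],
    # store the distance from its rightmost occurrence to the pattern's end.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift according to the text character aligned with the last
        # pattern position; characters absent from the table shift by m.
        pos += shift.get(text[pos + m - 1], m)
    return hits


if __name__ == "__main__":
    # Hypothetical toy data: two sequences differing in a single position.
    sequences = {"ref": "ACGTACGTTAGC", "ind1": "ACGTACGATAGC"}
    for name, seq in sequences.items():
        print(name, horspool_search("ACG", seq))
```

Searching every sequence separately repeats almost all of the work when the sequences are more than 99% identical, which is the inefficiency the abstract's setting is concerned with.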

Journal: International Journal of Data Mining and Bioinformatics
Publication Date: Jan 1, 2015
Citations: 5

Similar Papers

Next Generation Sequencing Technologies and Their Applications
Ku Chee‐Seng ... Pawitan Yudi
19 Apr 2010

Whole Genome Resequencing and 1000 Genomes Project
Ku Chee‐Seng ... Loy En Yun
19 Apr 2010

Next generation sequencing (NGS) in oncology: lights and shadows
Margherita Nannini ... Maria A Pantaleo
Cancer Breaking News | VOL. 4
15 Mar 2016

Studying the epigenome using next generation sequencing
Chee Seng Ku ... Richie Soong
Journal of Medical Genetics | VOL. 48
01 Jan 2010