On the number of elements to reorder when updating a suffix array

M Léonard, L Mouchard, M Salson

doi:10.1016/j.jda.2011.01.002

Abstract

Recently, new algorithms have appeared for updating the Burrows–Wheeler Transform or the suffix array when the text they index is modified. These algorithms proceed by reordering entries, and the number of such reordered entries may be as high as the length of the text. However, in practice, these algorithms are faster for updating the Burrows–Wheeler Transform or the suffix array than the fastest reconstruction algorithms. In this article we focus on the number of elements to be reordered for real-life texts. We show that this number is related to LCP values and that, on average, L_ave entries are reordered, where L_ave denotes the average LCP value, defined as the average length of the longest common prefix between two consecutive sorted suffixes. Since we know little about the LCP distribution for real-life texts, we conduct experiments on a corpus that consists of DNA sequences and natural language texts. The results show that, apart from texts containing large repetitions, the average LCP value is close to the one expected on a random text.
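
To make the definition of the average LCP value concrete, here is a minimal Python sketch that computes it for a short string. It is an illustration only, not the authors' code: the function names are hypothetical, the suffix array is built by naive sorting, the LCP array uses the well-known Kasai et al. scan, and the choice to average over the n-1 pairs of consecutive sorted suffixes is an assumption about normalization.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix they denote.
    # Fine for small examples; not suitable for large texts.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    # Kasai et al. scan: lcp[r] is the length of the longest common prefix
    # of the suffixes at ranks r-1 and r in sorted order (lcp[0] is 0).
    n = len(text)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def average_lcp(text):
    # Assumption: average over the n-1 pairs of consecutive sorted suffixes.
    if len(text) < 2:
        return 0.0
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    return sum(lcp[1:]) / (len(text) - 1)


if __name__ == "__main__":
    for sample in ("banana", "aaaa"):
        print(sample, average_lcp(sample))
```

On "banana" the sorted suffixes share prefixes of lengths 1, 3, 0, 0 and 2, giving an average of 1.2; on the highly repetitive "aaaa" the average rises to 2.0, which matches the abstract's remark that large repetitions push the average LCP value up.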

Journal: Journal of Discrete Algorithms
Publication Date: Jan 21, 2011
Citations: 33
License type: publisher-specific-oa

Similar Papers

Parallel distributed memory construction of suffix and longest common prefix arrays
Patrick Flick ... Srinivas Aluru
15 Nov 2015

A fast algorithm for constructing suffix arrays for DNA alphabets
Zeinab Rabea ... Magdi Zakaria
Journal of King Saud University - Computer and Information Sciences | VOL. 34
06 May 2022

Parallel suffix array and least common prefix for the GPU
Mrinal Deo ... Sean Keely
ACM SIGPLAN Notices | VOL. 48
23 Feb 2013

Efficient Substring Discovery Using Suffix, LCP Array and Algorithm-Architecture Interaction
Anindya Poddar
10 Jun 2022
