Fast Parallel Computation of Longest Common Prefixes

Julian Shun

doi:10.1109/sc.2014.37

Abstract

Suffix arrays and the corresponding longest common prefix (LCP) array have wide applications in bioinformatics, information retrieval and data compression. In this work, we propose and theoretically analyze new parallel algorithms for computing the LCP array given the suffix array as input. Most of our algorithms have a work and depth (parallel time) complexity related to the LCP values of the input. We also present a slight variation of Kärkkäinen and Sanders' skew algorithm that requires linear work and poly-logarithmic depth in the worst case. We present a comprehensive experimental study of our parallel algorithms along with existing parallel and sequential LCP algorithms. On a variety of real-world and artificial strings, we show that on a 40-core shared-memory machine our fastest algorithm is up to 2.3 times faster than the fastest existing parallel algorithm, and up to 21.8 times faster than the fastest sequential LCP algorithm.

Full Text