Faster index for property matching

Costas S Iliopoulos,M Sohel Rahman

doi:10.1016/j.ipl.2007.09.004

Abstract

In this paper, we revisit the Property Matching problem studied by Amir et al. [Property Matching and Weighted Matching, CPM 2006] and present a better indexing scheme for the problem. In particular, the data structure by Amir et al., namely PST, requires O ( n log | Σ | + n log log n ) construction time and O ( m log | Σ | + K ) query time, where n and m are the length of, respectively, the text and the pattern, Σ is the alphabet and K is the output size. On the other hand, the construction time of our data structure, namely IDS_PIP, is dominated by the suffix tree construction time and hence is O ( n ) time for alphabets that are natural numbers from 1 to a polynomial in n and O ( n log σ ) time otherwise, where σ = min ( n , | Σ | ) . The query time is same as that of PST. Also, IDS_PIP has the advantage that it can be built on either a suffix tree or a suffix array and additionally, it retains the capability of answering normal pattern matching queries.

Full Text