Abstract

The document array is a simple data structure commonly used together with the suffix array when indexing string collections. For each suffix in lexicographic order, it records the document to which that suffix belongs. There exist algorithms that compute the document array in linear time from an existing suffix array or, alternatively, during suffix array construction. In this chapter we present the algorithms gSAIS and gSACA-K (Louza et al., 2017), which construct the suffix array for a string collection, and we show how to modify them to also compute the document array within the same theoretical bounds.
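
As a rough illustration of the first approach (deriving the document array from an existing suffix array), the following sketch may help. It is not gSAIS or gSACA-K: the suffix array is built by naive sorting purely for demonstration, and the single separator character `$`, the function names, and the two-document example collection are all assumptions made for this example. Only the `document_array` step is the linear-time pass.

```python
def naive_suffix_array(text):
    # Illustrative only: sorts suffixes by direct string comparison.
    # This is far slower than linear time and is NOT gSAIS or gSACA-K.
    return sorted(range(len(text)), key=lambda i: text[i:])


def document_array(text, sa, sep="$"):
    # One linear pass labels every text position with its document index.
    # A separator is counted as the last position of its own document.
    doc_of_pos = [0] * len(text)
    doc = 0
    for i, ch in enumerate(text):
        doc_of_pos[i] = doc
        if ch == sep:
            doc += 1
    # A second linear pass maps each suffix-array entry to its document.
    return [doc_of_pos[p] for p in sa]


# Hypothetical collection: each document is terminated by the separator.
docs = ["ab", "b"]
text = "".join(d + "$" for d in docs)

sa = naive_suffix_array(text)
da = document_array(text, sa)

for rank, (pos, doc) in enumerate(zip(sa, da)):
    print(rank, pos, doc, text[pos:])
```

Each printed row shows a lexicographic rank, the starting position of the suffix, the document it belongs to, and the suffix itself. Because this sketch uses one shared separator, ties between separators are broken by the text that follows them, which may differ from the separator ordering used by a generalized suffix array construction.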
