Abstract

The directed acyclic word graph (DAWG) of a string y of length n is the smallest (partial) DFA that recognizes all suffixes of y, and it has O(n) nodes and edges. Na et al. [11] proposed the k-truncated suffix tree, a compressed trie that represents the substrings of a string whose length is at most k. In this paper, we present a new data structure called the k-truncated DAWG, which can be obtained by pruning the DAWG. We show that the size of the k-truncated DAWG of a string y of length n is \(O(\min \{n,kz\})\), which matches the size of the k-truncated suffix tree, where z is the size of the LZ77 factorization of y. We also present an \(O(n\log \sigma )\)-time and \(O(\min \{ n,kz\})\)-space algorithm for constructing the k-truncated DAWG of y, where \(\sigma \) is the alphabet size. As an application of truncated DAWGs, we show that the set \(MAW_k(y)\) of all minimal absent words of y of length at most k can be computed from the k-truncated DAWG of y in \(O(\min \{ n, kz\} + |MAW_k(y)|)\) time and \(O(\min \{ n,kz\})\) working space.
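
To make the output of the application concrete, the following is a minimal brute-force sketch of what \(MAW_k(y)\) contains. It is not the algorithm of this paper: it builds no DAWG and does not achieve the stated bounds. It assumes the standard definition of a minimal absent word (a word w that does not occur in y while every proper substring of w does) and takes the alphabet as an explicit argument. The function names `substrings_up_to` and `minimal_absent_words` are hypothetical, chosen only for illustration.

```python
def substrings_up_to(y, k):
    """Return all distinct substrings of y of length 0..k (naive enumeration)."""
    subs = {""}
    for i in range(len(y)):
        for j in range(i + 1, min(i + k, len(y)) + 1):
            subs.add(y[i:j])
    return subs


def minimal_absent_words(y, k, alphabet):
    """Brute-force MAW_k(y): minimal absent words of y of length at most k.

    A word w is reported when w does not occur in y, but both its longest
    proper prefix w[:-1] and its longest proper suffix w[1:] occur in y.
    This is equivalent to every proper substring of w occurring in y.
    """
    present = substrings_up_to(y, k)
    maws = set()
    for u in present:              # u = w[:-1] occurs in y by construction
        if len(u) >= k:            # extending u would exceed length k
            continue
        for c in alphabet:
            w = u + c
            if w not in present and w[1:] in present:
                maws.add(w)
    return maws


if __name__ == "__main__":
    print(sorted(minimal_absent_words("abaab", 3, "ab")))
```

For y = abaab over the alphabet {a, b}, the sketch reports aaa, bab, and bb for k = 3, and additionally aaba for k = 4. The set of substrings of length at most k that it materializes is the same information that a k-truncated index represents compactly.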
