Abstract

We extend earlier works on the relation of prefix arrays of indeterminate strings to undirected graphs and border arrays. If integer array y is the prefix array for indeterminate string w, then we say w satisfies y. We use a graph theoretic approach to construct a string on a minimum alphabet size which satisfies a given prefix array. We relate the problem of finding a minimum alphabet size to finding edge clique covers of a particular graph, allowing us to bound the minimum alphabet size by $n+\sqrt{n}$ for indeterminate strings, where $n$ is the size of the prefix array. When we restrict ourselves to prefix arrays for partial words, we bound the minimum alphabet size by $\left\lceil \sqrt{2n} \right\rceil$. Moreover, we show that this bound is tight up to a constant multiple by using Sidon sets. We also study the relationship between prefix arrays and border arrays. We give necessary and sufficient conditions for a border array and prefix array to be satisfied by the same indeterminate string. We show that the slowly-increasing property completely characterizes border arrays for indeterminate strings, whence there are exactly $C_n$ distinct border arrays of size $n$ for indeterminate strings (here $C_n$ is the $n$th Catalan number). We give an algorithm to enumerate all prefix arrays for partial words of a given size, $n$. Our algorithm has a time complexity of $n^3$ times the output size, that is, the number of valid prefix arrays for partial words of length $n$. We also bound the number of prefix arrays for partial words of a given size using Stirling numbers of the second kind.
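The Catalan count of border arrays can be checked numerically for small sizes. The sketch below is illustrative only: it assumes that "slowly-increasing" means the first entry is 0 and each later entry is a non-negative integer at most one more than its predecessor, a definition that is not spelled out in the abstract. The function names `count_slowly_increasing` and `catalan` are hypothetical helpers, not taken from the paper.

```python
from math import comb


def count_slowly_increasing(n):
    """Count arrays b of length n with b[0] == 0 and 0 <= b[i+1] <= b[i] + 1.

    Assumption: this is the meaning of "slowly-increasing" for border arrays.
    """
    # counts[v] = number of valid arrays of the current length whose last entry is v
    counts = {0: 1}
    for _ in range(n - 1):
        nxt = {}
        for v, c in counts.items():
            for u in range(v + 2):  # the next entry may be any of 0, 1, ..., v + 1
                nxt[u] = nxt.get(u, 0) + c
        counts = nxt
    return sum(counts.values())


def catalan(n):
    """n-th Catalan number via the closed form C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, count_slowly_increasing(n), catalan(n))
```

Under the assumed definition, the two columns printed by the script agree for every size tried, which is consistent with the stated count.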

Highlights

  • Strings are sequences of letters from a given alphabet

  • We study the relationship between prefix arrays and border arrays

  • We show that the slowly-increasing property completely characterizes border arrays for indeterminate strings, whence there are exactly $C_n$ distinct border arrays of size $n$ for indeterminate strings


Summary

Introduction

Strings are sequences of letters from a given alphabet. They have been extensively studied, and several generalizations have been proposed in the literature, including indeterminate strings and strings with don’t cares [10, 1, 3]. Christodoulakis et al. [4] described an algorithm for computing an indeterminate string corresponding to a given feasible prefix array. Such algorithmic characterizations of prefix arrays are of theoretical interest and also of practical use: for example, they help in the design of methods for randomly generating prefix arrays for software testing. Christodoulakis et al. [4] also established unexpected connections between indeterminate strings, prefix arrays, and undirected graphs.
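To make the objects concrete, here is a minimal sketch of how a prefix array could be computed for an indeterminate string. It rests on assumptions that are standard but not stated in the text above: each position of the string holds a non-empty set of symbols, two positions match when their sets intersect, and entry i of the prefix array is the length of the longest substring starting at position i that matches a prefix of the string. The function name `prefix_array` is a hypothetical helper, and the quadratic-time loop is a naive illustration rather than the algorithm of Christodoulakis et al. [4].

```python
def prefix_array(w):
    """Naive prefix array of an indeterminate string.

    w is a list of non-empty sets of symbols (assumed representation).
    Positions are 0-indexed, so the first entry is always len(w).
    """
    n = len(w)
    y = []
    for i in range(n):
        k = 0
        # Extend while the letter at offset k of the prefix shares a symbol
        # with the letter at offset k of the substring starting at i.
        while i + k < n and w[k] & w[i + k]:
            k += 1
        y.append(k)
    return y


if __name__ == "__main__":
    # The last position is indeterminate: it may be read as 'a' or 'b'.
    w = [{"a"}, {"b"}, {"a"}, {"a", "b"}]
    print(prefix_array(w))  # [4, 0, 2, 1]

    # An ordinary string is the special case where every set is a singleton.
    regular = [{c} for c in "abab"]
    print(prefix_array(regular))  # [4, 0, 2, 0]
```

The two examples differ only in the last position, yet their final entries differ: the set {a, b} matches both the first letter 'a' and the second letter 'b'. Because matching by intersection is not transitive, techniques that rely on transitivity of letter equality do not carry over directly to indeterminate strings.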

Preliminaries
Constructing Indeterminate Strings for Prefix Arrays
Connecting Prefix Arrays and Border Arrays
Restricting Prefix Arrays to Partial Words
Conclusion and Future Work