Abstract

This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons into a data structure that handles unbounded-length keys whose comparison cost is not a constant. Examples of these keys are strings, multidimensional points, multiple-precision numbers, multikey data (e.g., records), XML paths, URL addresses, etc. The technique is more general than what has been done in previous work, as no particular exploitation of the underlying structure is required. The only requirement is that the insertion of a key must identify its predecessor or its successor. Using the proposed technique, online suffix tree construction can be done in worst-case time $O(\log n)$ per input symbol (as opposed to amortized $O(\log n)$ time per symbol, achieved by previously known algorithms). To our knowledge, our algorithm is the first that achieves $O(\log n)$ worst-case time per input symbol. Searching for a pattern of length $m$ in the resulting suffix tree takes $O(\min(m \log |\Sigma|, m + \log n) + tocc)$ time, where $tocc$ is the number of occurrences of the pattern. The paper also describes more applications and shows how to obtain alternative methods for dealing with suffix sorting, dynamic lowest common ancestors, and order maintenance. The technical features of the proposed technique for a given data structure $\mathscr{D}$ are as follows. The new data structure $\mathscr{D}'$ is obtained from $\mathscr{D}$ by augmenting the latter with an oracle for strings, extending the functionalities of the Dietz--Sleator list for order maintenance [P. F. Dietz and D. D. Sleator, Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, ACM, New York, 1987, pp. 365--372; A. Tsakalidis, Acta Inform., 21 (1984), pp. 101--112].
The space complexity of $\mathscr{D}'$ is $\mathscr{S}(n) + O(n)$ memory cells for storing $n$ keys, where $\mathscr{S}(n)$ denotes the space complexity of $\mathscr{D}$. Then, each operation involving $O(1)$ keys taken from $\mathscr{D}'$ requires $O(\mathscr{T}(n))$ time, where $\mathscr{T}(n)$ denotes the time complexity of the corresponding operation originally supported in $\mathscr{D}$. Each operation involving a key $y$ not stored in $\mathscr{D}'$ takes $O(\mathscr{T}(n) + |y|)$ time, where $|y|$ denotes the length of $y$. For the special case where the oracle handles suffixes of a string, the achieved insertion time is $O(\mathscr{T}(n))$.
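
To give some intuition for why a search for a key $y$ need not pay the full comparison cost $|y|$ at every step, the following Python sketch shows a simpler and well-known idea: binary search over a sorted array of strings that remembers how many leading characters of $y$ already match the current lower and upper boundaries, and resumes each comparison from the smaller of the two. This is not the oracle described in this paper, it works on a static array rather than an arbitrary dynamic structure $\mathscr{D}$, and it does not guarantee the $O(\mathscr{T}(n) + |y|)$ bound in the worst case. The names `lcp_from`, `compare_from`, and `predecessor_index` are hypothetical and chosen only for illustration.

```python
def lcp_from(a, b, start):
    """Length of the longest common prefix of a and b, assuming the
    first `start` characters are already known to match."""
    i = start
    limit = min(len(a), len(b))
    while i < limit and a[i] == b[i]:
        i += 1
    return i


def compare_from(a, b, start):
    """Compare a with b, skipping the first `start` matching characters.
    Returns (sign, lcp): sign is -1, 0, or 1; lcp is the common prefix length."""
    k = lcp_from(a, b, start)
    if k == len(a) and k == len(b):
        return 0, k
    if k == len(a):          # a is a proper prefix of b
        return -1, k
    if k == len(b):          # b is a proper prefix of a
        return 1, k
    return (-1 if a[k] < b[k] else 1), k


def predecessor_index(sorted_keys, y):
    """Index of the largest key <= y in sorted_keys, or -1 if none exists.
    Illustrative simplification only, not the paper's technique."""
    lo, hi = -1, len(sorted_keys)   # sentinel boundaries outside the array
    lcp_lo, lcp_hi = 0, 0           # known lcp of y with each boundary key
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Every key between the boundaries shares at least
        # min(lcp_lo, lcp_hi) leading characters with y.
        sign, k = compare_from(y, sorted_keys[mid], min(lcp_lo, lcp_hi))
        if sign >= 0:
            lo, lcp_lo = mid, k
        else:
            hi, lcp_hi = mid, k
    return lo


keys = sorted(["banana", "band", "bandana", "can", "cane"])
print(predecessor_index(keys, "bandit"))
```

The returned index identifies the predecessor of $y$, which is the only information the technique requires from an insertion. The paper's oracle plays a comparable role for a general dynamic $\mathscr{D}$, with worst-case guarantees that this sketch does not provide.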
