Digital Trees and Memoryless Sources: from Arithmetics to Analysis

Philippe Flajolet, Brigitte Vallée, Mathieu Roux

doi:10.46298/dmtcs.2799

Abstract

Digital trees, also known as $\textit{"tries"}$, are fundamental to a number of algorithmic schemes, including radix-based searching and sorting, lossless text compression, dynamic hashing algorithms, communication protocols of the tree or stack type, distributed leader election, and so on. This extended abstract develops the asymptotic form of expectations of the main parameters of interest, such as tree size and path length. The analysis is conducted under the simplest of all probabilistic models; namely, the $\textit{memoryless source}$, under which letters that data items are comprised of are drawn independently from a fixed (finite) probability distribution. The precise asymptotic structure of the parameters' expectations is shown to depend on fine singular properties in the complex plane of a ubiquitous $\textit{Dirichlet series}$. Consequences include the characterization of a broad range of asymptotic regimes for error terms associated with trie parameters, as well as a classification that depends on specific $\textit{arithmetic properties}$, especially irrationality measures, of the sources under consideration.
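To make the model concrete, here is a minimal Python sketch of a memoryless source and of the two parameters named in the abstract. It is an illustration under stated assumptions, not the authors' method: the function names `sample_words` and `trie_parameters` are hypothetical, words are truncated to a fixed length, "size" is taken to mean the number of internal nodes, and "path length" the sum of the depths of the leaves.

```python
import random

def sample_words(n, probs, length, rng):
    """Draw n words of `length` letters from a memoryless source.

    Letters are 0..len(probs)-1, drawn independently with distribution `probs`.
    The fixed `length` is an assumption made only to keep words finite.
    """
    letters = range(len(probs))
    return [tuple(rng.choices(letters, weights=probs, k=length)) for _ in range(n)]

def trie_parameters(words, depth=0):
    """Return (size, path_length) of the trie built on `words`.

    Assumes the words are distinct and of equal length.
    size        = number of internal nodes (assumed convention)
    path_length = sum of leaf depths (assumed convention)
    """
    if len(words) <= 1:
        # Zero or one word: no internal node; a single word is a leaf at `depth`.
        return 0, depth * len(words)
    # Split the collection according to the letter at position `depth`.
    groups = {}
    for w in words:
        groups.setdefault(w[depth], []).append(w)
    size, path = 1, 0  # this node is internal
    for group in groups.values():
        s, p = trie_parameters(group, depth + 1)
        size += s
        path += p
    return size, path

if __name__ == "__main__":
    rng = random.Random(0)
    # Binary source with letter probabilities (1/3, 2/3); duplicates are dropped.
    words = list(set(sample_words(1000, [1 / 3, 2 / 3], 64, rng)))
    print(trie_parameters(words))
```

Averaging the output over many independent samples gives an empirical estimate of the expectations whose asymptotic form the paper studies.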

Highlights

Digital trees, known as “tries”, serve to represent finite collections of words over some finite alphabet: each subtree stemming directly from the root is associated with the subcollection of words starting with a given letter; each subtree at level two corresponds to a given prefix of length two, and so on.
As noted early [7, 8, 18, 36], quantifying the main parameters of the digital tree is strongly dependent upon the location of poles in the complex plane of the fundamental
The paper by Fayolle et al. [8] seems to have been the first to conduct a detailed discussion of the geometry of poles and related integration contours, with the “periodicity criterion” explicitly enunciated. As was recognized in subsequent years, largely by Jacquet, Louchard, and Szpankowski, digital tree analyses can serve as the basis of a remarkably precise understanding of the Lempel and Ziv schemes for data compression.

Summary

Introduction

Digital trees, known as “tries”, serve to represent finite collections of words over some finite alphabet: each subtree stemming directly from the root is associated with the subcollection of words starting with a given letter; each subtree at level two corresponds to a given prefix of length two, and so on. The paper by Fayolle et al. [8] seems to have been the first to conduct (in the binary case) a detailed discussion of the geometry of poles and related integration contours, with the “periodicity criterion” explicitly enunciated (cf. Theorem 1). As was recognized in subsequent years, largely by Jacquet, Louchard, and Szpankowski (see, e.g., [17, 23]), digital tree analyses can serve as the basis of a remarkably precise understanding of the Lempel and Ziv schemes for data compression. Similar comments apply to renewal theory and dynamical systems theory, where the periodicity–aperiodicity dichotomy (Section 2) plays a role: we refer to the works of Pollicott [29, p. 143], as well as Baladi, Cesaratto, and Vallée [3, 4, 6, 38] for a dynamical discussion.
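The prefix structure described at the start of this section can be sketched in a few lines of Python. This is only an illustration: the names `build_trie`, `leaves`, and `words_with_prefix` are hypothetical, the nested-dict representation is one possible choice, and an end marker `$` (assumed not to occur in the alphabet) is appended so that no word is a prefix of another.

```python
def build_trie(words):
    """Build a trie as nested dicts.

    An internal node maps a letter to a subtree; a leaf is the stored word
    (with the assumed end marker "$" appended).
    """
    marked = sorted({w + "$" for w in words})

    def build(group, depth):
        if len(group) == 1:
            return group[0]  # a single word left: store it in a leaf
        children = {}
        for w in group:
            # Words sharing a prefix of length `depth` split on the next letter.
            children.setdefault(w[depth], []).append(w)
        return {letter: build(sub, depth + 1) for letter, sub in children.items()}

    return build(marked, 0) if marked else {}

def leaves(node):
    """List the words stored below `node`, with the end marker removed."""
    if isinstance(node, str):
        return [node[:-1]]
    return [w for child in node.values() for w in leaves(child)]

def words_with_prefix(trie, prefix):
    """Return the subcollection of words that start with `prefix`."""
    node = trie
    for letter in prefix:
        if isinstance(node, str):
            break  # reached a leaf before the prefix was consumed
        if letter not in node:
            return []
        node = node[letter]
    return [w for w in leaves(node) if w.startswith(prefix)]

if __name__ == "__main__":
    trie = build_trie(["abba", "abab", "baba", "b"])
    print(words_with_prefix(trie, "a"))   # subtree of the root for letter "a"
    print(words_with_prefix(trie, "ab"))  # subtree at level two for prefix "ab"
```

Each call to `words_with_prefix` follows one branch per letter, so the subtree reached after reading a prefix holds exactly the words that share it.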

Statement of the main result

Ladders and poles

Proof of Theorem 2

Error bounds

Rational probabilities and metric aspects

Asymptotic analysis of tries

Invariance of the irrationality exponent

B Numerical aspects


Journal: Discrete Mathematics & Theoretical Computer Science
Publication Date: Jan 1, 2010
Citations: 60
License type: cc-by

