Abstract

Author(s): Evans, SN; Wakolbinger, A | Abstract: The trie-based radix sort algorithm stores pairwise different infinite binary strings in the leaves of a binary tree in a way that the Ulam-Harris coding of each leaf equals a prefix (that is, an initial segment) of the corresponding string, with the prefixes being of minimal length so that they are pairwise different. We investigate the radix sort tree chains – the tree-valued Markov chains that arise when successively storing the finite collections of random infinite binary strings Z1,…,Zn, n = 1, 2,… according to the trie-based radix sort algorithm, where the source strings Z1, Z2,… are independent and identically distributed. We establish a bijective correspondence between the full Doob–Martin boundary of the radix sort tree chain with a symmetric Bernoulli source (that is, each Zk is a fair coin-tossing sequence) and the family of radix sort tree chains for which the common distribution of the Zk is a diffuse probability measure on {0, 1}∞. In essence, our result characterizes all the ways that it is possible to condition such a chain of radix sort trees consistently on its behavior “in the large”.
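
The storage rule described in the abstract can be illustrated with a small sketch. It assumes each infinite string is supplied as a finite truncation (a Python str of '0' and '1' characters) that is long enough to separate it from the other inputs; the function names common_prefix_length and radix_sort_leaves are hypothetical and do not come from the paper.

```python
def common_prefix_length(a, b):
    """Number of leading characters on which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def radix_sort_leaves(strings):
    """Return, for each input, its minimal prefix that separates it from the rest.

    Assumption: every input is a finite truncation of an infinite binary
    string. With a single input the tree is trivial and the leaf is the
    root, coded here by the empty string.
    """
    if len(set(strings)) != len(strings):
        raise ValueError("inputs must be pairwise different")
    if len(strings) == 1:
        return [""]
    leaves = []
    for i, z in enumerate(strings):
        # Longest agreement between z and any other input.
        longest = max(
            common_prefix_length(z, w) for j, w in enumerate(strings) if j != i
        )
        if longest >= len(z):
            raise ValueError("truncation too short to separate the inputs")
        # One more symbol than the longest agreement is the shortest
        # prefix of z that is not a prefix of any other input.
        leaves.append(z[: longest + 1])
    return leaves


sample = ["0010", "0111", "1100"]
print(radix_sort_leaves(sample))             # ['00', '01', '1']
print(radix_sort_leaves(sample + ["0011"]))  # ['0010', '01', '1', '0011']
```

Adding the fourth string lengthens the leaf of the first one from 00 to 0010, which shows how the tree can grow when inputs are stored successively.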

Highlights

  • Various sorting algorithms proceed by storing the data in the leaves of a tree

  • We establish a bijective correspondence between the full Doob–Martin boundary of the radix sort tree chain with a symmetric Bernoulli source and the family of radix sort tree chains for which the common distribution of the Zk is a diffuse probability measure on {0, 1}∞

  • If the data are infinite binary strings z1, …, zn ∈ {0, 1}∞, a natural choice for the tree is the rooted binary tree with n leaves chosen such that the Ulam-Harris coding of each of the leaves coincides with a finite initial segment of one of the zj, and such that these initial segments are pairwise different and have minimal length

Summary

Introduction

Various sorting algorithms proceed by storing the data in the leaves of a tree. If the data are infinite binary strings z1, …, zn ∈ {0, 1}∞, a natural choice for the tree is the rooted binary tree with n leaves chosen such that the Ulam-Harris coding of each of the leaves coincides with a finite initial segment (otherwise called a prefix or left factor) of one of the zj, and such that these initial segments are pairwise different and have minimal length (see below for a fuller description). Any Markov chain with initial state the trivial tree ∅ and transition probabilities that arise from those of (Rn)n∈N through the h-transform construction for some nonnegative harmonic function h (normalized, without loss of generality, so that h(∅) = 1) is an infinite bridge. The following is our main result characterizing all the ways that it is possible to condition the radix sort tree chain with inputs distributed according to fair coin-tossing measure. If we correct for this lack of uniqueness, it is natural to conjecture that a close analogue of Theorem 1.1 holds: namely, the extremal infinite bridges for the PATRICIA chain built from i.i.d. inputs distributed according to fair coin-tossing measure are exactly the chains of the form (ν Rn)n∈N as ν ranges over some complete collection of equivalence class representatives. Theorem 6.1 and Corollary 6.3 in Section 6 together establish Theorem 1.1.
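
For reference, the h-transform construction mentioned in this paragraph has the following standard form. The symbols p and p^h for the transition probabilities of the original and transformed chains, and s, t for trees, are notation assumed here for illustration rather than taken from the source.

```latex
% h is nonnegative and harmonic for the chain (R_n) with transition probabilities p:
\[
  h(s) = \sum_{t} p(s,t)\, h(t), \qquad h(\emptyset) = 1 .
\]
% The h-transformed chain lives on the states with h(s) > 0 and moves according to
\[
  p^{h}(s,t) = \frac{h(t)}{h(s)}\, p(s,t), \qquad h(s) > 0 .
\]
% Harmonicity of h is exactly what makes each row of p^h sum to one.
```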

Forward transition probabilities
Backward transition probabilities
The Doob-Martin kernel and harmonic functions
Labeled infinite bridges