Abstract

Andersson and Nilsson introduced in 1993 the level-compressed trie (LC trie for short), in which a full subtree of a node is compressed into a single node whose degree equals the size of the subtree. Recent experimental results indicated a dramatic improvement when full subtrees are replaced by partially filled subtrees. In this paper, we provide a theoretical justification of these experimental results, showing, among other things, a rather moderate improvement of the search time over the original LC tries. For this analysis, we assume that n strings are generated independently by a binary memoryless source (a generalization to Markov sources is possible), with p denoting the probability of emitting one of the two symbols (and q = 1 − p the probability of the other). We first prove that the so-called α-fillup level Fn(α) (i.e., the largest level in a trie with an α fraction of nodes present at this level) is concentrated on two values whp (with high probability): either Fn(α) = kn or Fn(α) = kn + 1, where [EQUATION] is an integer and Φ(x) denotes the normal distribution function. This result directly yields the typical depth (search time) Dn(α) in the α-LC tries with p ≠ 1/2; namely, we show that whp Dn(α) ∼ C1 log log n, where C1 = 1/| log(1 − h / log(1/√pq))| and h = −p log p − q log q is the Shannon entropy rate. This should be compared with the recently found typical depth in the original LC tries, which is C2 log log n, where C2 = 1/| log(1 − h / log(1 / min{p, 1−p}))|. In conclusion, we observe that α affects only the lower-order term of the α-fillup level Fn(α), and the search time in α-LC tries is of the same order as in the original LC tries.
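To make the comparison between the two constants concrete, the following minimal sketch evaluates C1 and C2 from the formulas stated above for a few values of p. The function name `lc_trie_depth_constants` is hypothetical and not taken from the paper, and the use of natural logarithms is an assumption: the ratios h / log(·) do not depend on the base, but the outer logarithm does, so the base should match the one used in log log n.

```python
import math


def lc_trie_depth_constants(p: float) -> tuple[float, float]:
    """Return (C1, C2) for a binary memoryless source with symbol probability p.

    C1 is the constant for alpha-LC tries, C2 the one for the original LC tries,
    both as written in the abstract. Natural logarithms are an assumption here.
    """
    if not 0.0 < p < 1.0 or p == 0.5:
        # The formulas are stated only for the asymmetric case p != 1/2.
        raise ValueError("p must lie in (0, 1) and differ from 1/2")
    q = 1.0 - p
    # Shannon entropy rate h = -p log p - q log q.
    h = -p * math.log(p) - q * math.log(q)
    # C1 = 1 / |log(1 - h / log(1 / sqrt(pq)))|
    c1 = 1.0 / abs(math.log(1.0 - h / math.log(1.0 / math.sqrt(p * q))))
    # C2 = 1 / |log(1 - h / log(1 / min{p, 1 - p}))|
    c2 = 1.0 / abs(math.log(1.0 - h / math.log(1.0 / min(p, q))))
    return c1, c2


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.45):
        c1, c2 = lc_trie_depth_constants(p)
        print(f"p={p:.2f}  C1={c1:.4f}  C2={c2:.4f}")
```

Since log(1/√pq) ≤ log(1/min{p, 1−p}), the sketch returns C1 < C2 for every admissible p, which matches the claim that both depths are of order log log n and differ only in the leading constant.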
