Abstract

Let $\mathbf{X} = \{X_n: n = 1, 2,\dots\}$ be a discrete-valued stationary ergodic process distributed according to probability $P$. Let $\mathbf{Z}_1^n = \{Z_1, Z_2,\dots, Z_n\}$ be an independent realization of an n-block drawn with the same probability as $\mathbf{X}$. We consider the waiting time $W_n$, defined as the first time the n-block $\mathbf{Z}_1^n$ appears in $\mathbf{X}$. There are many recent results concerning this waiting time that demonstrate asymptotic properties of this random variable. In this paper, we prove that for all n the random variable $W_nP(\mathbf{Z}_1^n)$ is approximately distributed as an exponential random variable with mean 1. We use a Poisson heuristic to provide a very simple intuition for this result, which is then formalized using the Chen-Stein method. We then rederive, with remarkable brevity, most of the known asymptotic results concerning $W_n$ and prove others as well. We further establish the surprising fact that for many sources $W_nP(\mathbf{Z}_1^n)$ is exp(1) even if the probability law for $\mathbf{Z}$ is not the same as that of $\mathbf{X}$. We also consider the d-dimensional analog of the waiting time and prove a similar result in that setting. Nearly identical results are then derived for the recurrence time $R_n$, defined as the first time the initial n-block $\mathbf{X}_1^n$ reappears in $\mathbf{X}$. We conclude by developing applications of these results to provide concise solutions to problems that stem from the analysis of the Lempel-Ziv data compression algorithm. We also consider possible applications to DNA sequence analysis.
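
The Poisson heuristic mentioned above can be sketched as follows. This is an informal illustration under stated assumptions, not the paper's own derivation: it fixes a block $z$ with probability $p = P(z)$, introduces the count $N_k$ (notation assumed here) of occurrences of $z$ among the first $k$ starting positions of $\mathbf{X}$, and treats those occurrences as rare, nearly independent events, which is the approximation that the Chen-Stein method is used to control.

$$
E[N_k] = kp, \qquad P(W_n > k \mid \mathbf{Z}_1^n = z) = P(N_k = 0) \approx e^{-kp}.
$$

Taking $k \approx t/p$ then suggests $P\big(W_n P(z) > t \mid \mathbf{Z}_1^n = z\big) \approx e^{-t}$, the tail of an exponential random variable with mean 1, whatever the particular block $z$.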

