Abstract

We study the maximum Markov model order that can be estimated for an (individual) input sequence $x$ of length $n$ over a finite alphabet of size $\alpha $ , for popular estimators of tree models, where the estimated order is determined by the depth of a context tree estimate, and in the special case of plain Markov models, where the tree is constrained to be perfect (with all leaves at the same depth). First, we consider penalized maximum likelihood estimators where a context tree $\hat {T}$ is obtained by minimizing a cost of the form $-\log \hat {P}_{T}(x) + f(n)|S_{T}|$ , where $\hat {P}_{T}(x)$ is the ML of $x$ under a model with context tree $T$ , $S_{T}$ is the set of leaves of $T$ , and $f(n)$ is an increasing (penalization) function of $n$ (the popular BIC estimator is a special case with $f(n)=\frac {\alpha -1}{2}\log n$ ). For plain Markov models, a simple argument yields a known upper bound $\overline {k}(n) = O(\log n)$ on the maximum order that can be estimated for $x$ , and we show that, in fact, this simple bound is not far from tight. For general context trees, we derive an asymptotic upper bound, $n^{1/2 - o(1)}$ , on the estimated depth, and we exhibit explicit input sequences that asymptotically attain the bound up to a multiplicative constant factor. We show that a similar upper bound applies also to MDL estimators based on the KT probability assignment and, moreover, the same example sequences asymptotically approach the upper bound also in this case.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.