Abstract

Markov chains with variable length are useful stochastic models for data compression that avoid the curse of dimensionality faced by full Markov chains. In this article we introduce a variable length Markov chain whose transition probabilities depend not only on the state history but also on exogenous covariates through a generalized linear model. The goal of the proposed procedure is to estimate both the context of the process, that is, the portion of its history that is relevant for predicting the next state, and the coefficients of the significant exogenous variables. The proposed method is consistent in the sense that the probability that the estimated context and coefficients coincide with those of the true data generating mechanism tends to 1 as the sample size increases. Simulations suggest that, when covariates do contribute to the transition probabilities, the proposed procedure can recover both the tree structure and the regression parameters. The procedure outperforms variable length Markov chains when covariates are present and yields comparable results when covariates are absent. For models with fixed length, the accuracy of the proposed algorithm in recovering the true data generating mechanism is close to that of the methods available in the literature. The proposed methodology is used to predict the gains and losses of the Hang Seng index based on its own history and on three large stock market indices.
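
To make the model class concrete, the following is a minimal illustrative sketch, not the article's procedure. It assumes a binary state space, a single covariate, and a logistic link; the context tree `CONTEXTS`, its coefficients, and the helper names `find_context`, `prob_one`, and `simulate` are all hypothetical. It only shows how a transition probability can depend on a variable length context together with a covariate; the estimation of contexts and coefficients described above is not implemented here.

```python
import math
import random

# Hypothetical context tree for a binary process (an assumption for
# illustration). Each context lists the most recent states, newest first,
# and carries its own logistic coefficients: (intercept, slope on covariate).
CONTEXTS = {
    (0,): (-0.5, 1.0),     # last state was 0
    (1, 0): (0.2, 0.0),    # last state 1, previous state 0; covariate inactive
    (1, 1): (0.8, -1.5),   # last two states were both 1
}


def find_context(history):
    """Return the context in CONTEXTS that matches the end of the history."""
    recent_first = tuple(reversed(history))
    for ctx in CONTEXTS:
        if recent_first[:len(ctx)] == ctx:
            return ctx
    raise ValueError("history too short to match any context")


def prob_one(history, x):
    """P(next state = 1 | matched context, covariate x) under a logistic link."""
    intercept, slope = CONTEXTS[find_context(history)]
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def simulate(n, covariates, seed=0):
    """Draw n states given one covariate value per time step."""
    rng = random.Random(seed)
    states = [1, 0]  # arbitrary initial history, long enough for any context
    for t in range(n):
        p = prob_one(states, covariates[t])
        states.append(1 if rng.random() < p else 0)
    return states[2:]  # drop the initial history


if __name__ == "__main__":
    xs = [math.sin(t / 5.0) for t in range(20)]  # toy covariate series
    print(simulate(20, xs))
```

In this sketch the context `(1, 0)` has a zero slope, mimicking a covariate that is not significant for that branch of the tree, while the other contexts let the covariate shift the transition probability.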
