Abstract

A Partition Markov Model characterizes a process by a partition L of the state space, where the elements in each part of L share the same transition probability to an arbitrary element of the alphabet. The model aims to answer two questions: what is the minimal number of parameters needed to specify a Markov chain, and how can these parameters be estimated? To answer them, we build a consistent model selection strategy: given a realization of the process of size n, find a model within the Partition Markov class with a minimal number of parts that represents the law of the process. From this strategy, we derive a measure that establishes a metric on the state space. In addition, we show that if the law of the process is Markovian, then L is eventually recovered as n goes to infinity. We show an application to modeling internet navigation patterns.

Highlights

  • Markov models have received enormous visibility for being powerful tools [1,2,3]

  • Under the assumption of this family, we address the problem of model selection, showing that the model can be selected consistently using the Bayesian Information Criterion (BIC)

  • Developing the partition concept for Markov processes makes it possible to prove that, for a stationary, finite-memory process and a large enough sample, a minimal partition representing the process can be found consistently, both in theory and in practice

Summary

Introduction

Markov models have received enormous visibility for being powerful tools [1,2,3]. [4] shows that the Bayesian Information Criterion (BIC) [5] can be used to consistently choose a Variable Length Markov Chain model in an efficient way using the Context. Partition Markov Models are being used and explored intensively: for instance, [12] combines two statistical concepts, Copulas and Partition Markov Models, to define a natural correction for the estimator of the transition probabilities of a multivariate Markov process. We introduce a distance between the parts of a partition; this concept defines a metric on the state space and makes it possible to build efficient algorithms for estimating the optimal partition (see [10]). The proofs of the results introduced in this paper are included in Appendices A and B.
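To make the idea of scoring a partition concrete, the following is a minimal sketch and not the authors' algorithm. It pools the transition counts of all states in the same part and computes a BIC-style score, assuming the usual penalty of half the number of free parameters times ln n, with |L|(|A| - 1) free parameters for a partition L over an alphabet A. The function names `transition_counts` and `bic`, and the toy sample, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def transition_counts(sample, order=1):
    """Count (state, next_symbol) pairs; a state is a tuple of `order` symbols."""
    counts = Counter()
    for t in range(order, len(sample)):
        state = tuple(sample[t - order:t])
        counts[(state, sample[t])] += 1
    return counts


def bic(sample, partition, alphabet, order=1):
    """BIC-style score of a partition, given as a list of sets of states.

    Assumption: score = maximized log-likelihood
                        - (|L| * (|A| - 1) / 2) * ln(n),
    where n is the sample length. Higher is better.
    """
    counts = transition_counts(sample, order)
    n = len(sample)
    loglik = 0.0
    for part in partition:
        # States in the same part share one transition law, so pool their counts.
        pooled = {a: sum(counts[(s, a)] for s in part) for a in alphabet}
        total = sum(pooled.values())
        for c in pooled.values():
            if c > 0:
                loglik += c * math.log(c / total)
    n_params = len(partition) * (len(alphabet) - 1)
    return loglik - 0.5 * n_params * math.log(n)


# Toy sample: states 0 and 1 always move to 2, so they share a transition law.
sample = [0, 2, 1, 2] * 25
alphabet = [0, 1, 2]
merged = [{(0,), (1,)}, {(2,)}]      # 0 and 1 in the same part
full = [{(0,)}, {(1,)}, {(2,)}]      # every state in its own part
wrong = [{(0,), (2,)}, {(1,)}]       # merges states with different laws

print(bic(sample, merged, alphabet))
print(bic(sample, full, alphabet))
print(bic(sample, wrong, alphabet))
```

In this toy case the merged partition fits the data exactly as well as the full one while using fewer parameters, so it scores higher. Merging states 0 and 2, whose empirical transition laws differ, lowers the likelihood by far more than the penalty saves.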

Preliminaries
Consistent Estimation through the Bayesian Information Criterion
A Metric on the State Space
Consistent Estimation of the Process’s Partition
Method
Conclusions