Abstract

In Bayesian model selection, the deviance information criterion (DIC) has become a widely used criterion. It is, however, not directly defined for hidden Markov models (HMMs): the main challenge in applying the DIC to HMMs is that the observed likelihood function of such models is not available in closed form. A closed form for the observed likelihood function can be obtained either by summing over all possible hidden states of the complete likelihood using the so-called forward recursion, or by integrating out the hidden states in the conditional likelihood. We therefore propose two versions of the DIC for the model choice problem in the HMM context, namely the recursive deviance-based DIC and the conditional likelihood-based DIC. In this paper, we compare several normal HMMs after they are estimated by Bayesian MCMC methods. We conduct a simulation study on synthetic data generated under two design factors, namely the heterogeneity level and the number of states. We show that the recursive deviance-based DIC performs well in selecting the correct model, whereas the conditional likelihood-based DIC tends to prefer more complicated models. A real application involving the waiting times of the Old Faithful geyser data is also used to check these criteria. All simulations were conducted in Python v.2.7.10; code is available from the first author on request.
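As an informal illustration of the forward recursion mentioned above (a sketch under our own naming, not the paper's code), the observed log-likelihood of a K-state normal HMM can be computed in log space as follows, where pi is the initial state distribution, A the transition matrix, and mu, sigma the state-specific normal parameters:

    import numpy as np
    from scipy.special import logsumexp
    from scipy.stats import norm

    def hmm_log_likelihood(y, pi, A, mu, sigma):
        # Observed log-likelihood log p(y | theta) of a normal HMM,
        # obtained by summing over all hidden-state paths with the
        # forward recursion, carried out in log space for stability.
        log_emis = norm.logpdf(np.asarray(y)[:, None], loc=mu, scale=sigma)  # T x K
        log_alpha = np.log(pi) + log_emis[0]  # initialisation at t = 1
        for t in range(1, len(y)):
            # log alpha_t(j) = log sum_i exp(log alpha_{t-1}(i) + log A_ij) + log f(y_t | j)
            log_alpha = logsumexp(log_alpha[:, None] + np.log(A), axis=0) + log_emis[t]
        return logsumexp(log_alpha)  # log p(y_1, ..., y_T | theta)

Evaluating this quantity at each MCMC draw yields the deviance samples needed for the recursive deviance-based DIC.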

Highlights

  • Hidden Markov Models (HMMs) have been used to model various types of data: discrete, continuous, univariate, multivariate, mixed and mixture data (MacDonald and Zucchini, 2009)

  • We have illustrated two main ways to obtain a closed form for the observed likelihood of HMMs

  • The first proposed criterion, the recursive deviance-based deviance information criterion (DIC), is based on the observed likelihood obtained by summing all possible states of the complete data likelihood using forward recursion


Introduction

Hidden Markov Models (HMMs) have been used to model various types of data: discrete, continuous, univariate, multivariate, mixed and mixture data (MacDonald and Zucchini, 2009). When fitting several HMMs to a data set, we seek to determine the number of hidden states of a model that adequately fits those data, or more formally the best model among the competing models. Frequentist criteria such as the Akaike information criterion (AIC) (Akaike, 1973) and the Bayesian information criterion (BIC) (Schwarz, 1978) have been applied to the HMM model choice problem. In the Bayesian framework, the deviance information criterion (DIC) is a natural counterpart, but a number of difficulties are inherent in this criterion, especially in more complicated models, e.g. mixture and hidden Markov models, where latent (hidden) variables and model parameters are non-identifiable from the data (Celeux et al., 2006). In this setting, Celeux et al. (2006) introduced eight formulae for DICs, classified according to the nature of the likelihood used: observed, complete and conditional. The data augmentation approach (Tanner and Wong, 1987) is often used with MCMC methods in the Bayesian inference of complicated models such as HMMs; with it, a closed form of the likelihood function of the HMM becomes available and the parameter estimation becomes more tractable.
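To fix ideas, the DIC can be assembled from MCMC output as in the following generic sketch (the function and argument names, such as dic_from_draws and log_lik_draws, are hypothetical, not the paper's). For the recursive deviance-based DIC the log-likelihood would be the forward-recursion one sketched earlier; for the conditional likelihood-based DIC it would be the likelihood conditional on the sampled hidden states.

    import numpy as np

    def dic_from_draws(log_lik_draws, log_lik_at_mean):
        # DIC = D_bar + p_D, with deviance D(theta) = -2 log p(y | theta).
        # log_lik_draws: log-likelihood evaluated at each posterior draw of theta.
        # log_lik_at_mean: log-likelihood at a plug-in estimate, e.g. the posterior mean.
        D_bar = -2.0 * np.mean(log_lik_draws)  # posterior mean deviance
        D_hat = -2.0 * log_lik_at_mean         # deviance at the plug-in estimate
        p_D = D_bar - D_hat                    # effective number of parameters
        return D_bar + p_D                     # equivalently 2 * D_bar - D_hat

The two proposed criteria thus differ only in which closed-form likelihood is plugged into the deviance, which is what the simulation study compares.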
