Abstract

The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (Freely available code implementing the MAP-DP algorithm for Gaussian mixtures can be found at http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the "rich get richer" property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood, which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches, both in terms of computational complexity and clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model, whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
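To make "as simple as DP-means" concrete, the following is a minimal sketch of a MAP-DP-style assignment sweep, assuming a DPMM with spherical Gaussian clusters of known variance and a conjugate Gaussian prior on cluster means. The function name, hyperparameters and this specific conjugate model are illustrative choices for the sketch, not the authors' released implementation (see http://www.maxlittle.net/ for that):

    import numpy as np

    def map_dp(X, alpha, sigma2, mu0, sigma02, max_iter=100):
        # Sketch of approximate MAP inference for a DPMM with spherical
        # Gaussian clusters of known variance sigma2 and a conjugate
        # N(mu0, sigma02 * I) prior on cluster means (illustrative only).
        N, D = X.shape
        z = np.zeros(N, dtype=int)  # start with every point in one cluster
        for _ in range(max_iter):
            changed = False
            for i in range(N):
                others = np.arange(N) != i
                ks = np.unique(z[others])  # non-empty clusters without point i
                costs = []
                for k in ks:
                    members = X[others & (z == k)]
                    nk = len(members)
                    # Collapsed posterior predictive N(m, v * I) for cluster k
                    prec = 1.0 / sigma02 + nk / sigma2
                    m = (mu0 / sigma02 + members.sum(axis=0) / sigma2) / prec
                    v = sigma2 + 1.0 / prec
                    nll = 0.5 * (D * np.log(2 * np.pi * v)
                                 + np.sum((X[i] - m) ** 2) / v)
                    # CRP "rich get richer" discount: -ln N_{k,-i}
                    costs.append(nll - np.log(nk))
                # Cost of opening a new cluster: prior predictive, -ln(alpha)
                v0 = sigma2 + sigma02
                costs.append(0.5 * (D * np.log(2 * np.pi * v0)
                                    + np.sum((X[i] - mu0) ** 2) / v0)
                             - np.log(alpha))
                best = int(np.argmin(costs))
                if best == len(ks):           # a new cluster wins
                    if np.sum(z == z[i]) == 1:
                        continue              # already a singleton: no change
                    z[i], changed = z.max() + 1, True
                elif ks[best] != z[i]:
                    z[i], changed = ks[best], True
            if not changed:
                break  # no reassignment in a full sweep: local MAP reached
        return z

This sketch recomputes cluster statistics from scratch for clarity; caching per-cluster sums brings a sweep down to the same order of cost as a DP-means pass. Under this conjugate model, each greedy reassignment cannot increase the collapsed negative log posterior, which is what yields convergence to an approximate MAP solution.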

Highlights

  • Bayesian nonparametric (BNP) models have been successfully applied to a wide range of domains, but despite significant improvements in computational hardware, statistical inference in most BNP models remains infeasible for large datasets, or for moderate-sized datasets where computational resources are limited

  • We concentrate on inference for the Dirichlet process mixture model (DPMM) and for the infinite hidden Markov model (Beal et al., 2002), but our arguments are more general and can be extended to many BNP models

  • Small variance asymptotic (SVA) reasoning breaks many of the key properties of the underlying probabilistic model: SVA applied to the DPMM (Kulis and Jordan, 2012; Jiang et al., 2012) loses the rich-get-richer effect of the infinite clustering, because the prior term over the partition drops from the likelihood, and the resulting degenerate likelihood rules out any rigorous out-of-sample prediction, including cross-validation (see the schematic contrast after this list)
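The contrast can be written compactly. The following display is a schematic summary in our own notation, following Kulis and Jordan (2012) for the SVA limit; it is not reproduced verbatim from the paper:

    % DP-means (the SVA limit): the CRP prior term vanishes, leaving a
    % K-means-like assignment with a fixed new-cluster penalty \lambda:
    z_i = \arg\min_k \| x_i - \mu_k \|^2 , \qquad
          \text{open a new cluster if } \min_k \| x_i - \mu_k \|^2 > \lambda

    % MAP-DP: the collapsed model keeps the CRP terms, so existing
    % clusters are discounted by their size N_{k,-i} and a new cluster
    % by the concentration parameter \alpha:
    z_i = \arg\min_k \Big[ -\ln p\big(x_i \mid x^{(k)}_{-i}\big) - \ln N_{k,-i} \Big] ,
    \qquad \text{new-cluster cost: } -\ln p(x_i) - \ln \alpha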

Summary

Introduction

Bayesian nonparametric (BNP) models have been successfully applied to a wide range of domains, but despite significant improvements in computational hardware, statistical inference in most BNP models remains infeasible for large datasets, or for moderate-sized datasets where computational resources are limited. One existing remedy takes small variance asymptotic (SVA) limits of the model: by making some additional simplifying assumptions, this approach reduces MCMC updates to a fast optimization algorithm that converges quickly to an approximate MAP solution. However, SVA reasoning breaks many of the key properties of the underlying probabilistic model: applied to the DPMM (Kulis and Jordan, 2012; Jiang et al., 2012), it loses the rich-get-richer effect of the infinite clustering, because the prior term over the partition drops from the likelihood, and the degenerate likelihood rules out any rigorous out-of-sample prediction, including cross-validation.
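Because the MAP-DP construction keeps a non-degenerate likelihood, held-out data can be scored directly, which is precisely what the degenerate SVA limit forbids. Below is a minimal sketch, reusing the illustrative spherical-Gaussian conjugate model from the sketch above (names and model choices are ours, not the paper's):

    import numpy as np

    def heldout_log_predictive(Xtest, Xtrain, z, alpha, sigma2, mu0, sigma02):
        # Log predictive density of held-out points given a fitted partition
        # z of Xtrain; mixture weights are the CRP predictive weights.
        N, D = Xtrain.shape
        ks, counts = np.unique(z, return_counts=True)
        total = 0.0
        for x in Xtest:
            logps = []
            for k, nk in zip(ks, counts):
                members = Xtrain[z == k]
                prec = 1.0 / sigma02 + nk / sigma2
                m = (mu0 / sigma02 + members.sum(axis=0) / sigma2) / prec
                v = sigma2 + 1.0 / prec
                logps.append(np.log(nk / (N + alpha))
                             - 0.5 * (D * np.log(2 * np.pi * v)
                                      + np.sum((x - m) ** 2) / v))
            # New-cluster component with CRP weight alpha / (N + alpha)
            v0 = sigma2 + sigma02
            logps.append(np.log(alpha / (N + alpha))
                         - 0.5 * (D * np.log(2 * np.pi * v0)
                                  + np.sum((x - mu0) ** 2) / v0))
            total += np.logaddexp.reduce(logps)
        return total

Summed over folds, this score gives a cross-validation criterion for choosing alpha or the likelihood hyperparameters; no analogous quantity exists under the SVA objective, where the likelihood has collapsed to a degenerate limit.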

Collapsed Gibbs sampling for Dirichlet process mixtures
Introducing MAP-DP: A novel approximate MAP algorithm for collapsed DPMMs
Inference
Out-of-sample prediction
Synthetic CRP parameter estimation
UCI datasets
MAP-DP for infinite hidden Markov models
Gibbs sampler
MAP-DP for iHMMs
SVA for iHMMs
Synthetic study
MAP-DP for semiparametric mixed effects models
English Longitudinal Study of Ageing
Findings
Discussion and future directions
