Sampling from Dirichlet process mixture models with unknown concentration parameter: mixing issues in large data implementations.

Silvia Liverani, Sylvia Richardson, David I Hastie

doi:10.1007/s11222-014-9471-3


Open Access

https://doi.org/10.1007/s11222-014-9471-3


Abstract

We consider the question of Markov chain Monte Carlo sampling from a general stick-breaking Dirichlet process mixture model, with concentration parameter α. This paper introduces a Gibbs sampling algorithm that combines the slice sampling approach of Walker (Communications in Statistics - Simulation and Computation 36:45–54, 2007) and the retrospective sampling approach of Papaspiliopoulos and Roberts (Biometrika 95(1):169–186, 2008). Our general algorithm is implemented as efficient open source C++ software, available as an R package, and is based on a blocking strategy similar to that suggested by Papaspiliopoulos (A note on posterior sampling from Dirichlet mixture models, 2008) and implemented by Yau et al. (Journal of the Royal Statistical Society, Series B (Statistical Methodology) 73:37–57, 2011). We discuss the difficulties of achieving good mixing in MCMC samplers of this nature in large data sets and investigate sensitivity to initialisation. We additionally consider the challenges when an additional layer of hierarchy is added such that joint inference is to be made on α. We introduce a new label-switching move and compute the marginal partition posterior to help surmount these difficulties. Our work is illustrated using a profile regression (Molitor et al. Biostatistics 11(3):484–498, 2010) application, where we demonstrate good mixing behaviour for both synthetic and real examples.

Electronic supplementary material: The online version of this article (doi:10.1007/s11222-014-9471-3) contains supplementary material, which is available to authorized users.

Highlights

Fitting mixture distributions to model some observed data is a common inferential strategy within statistical modelling, used in applications ranging from density estimation to regression analysis
The use of the Dirichlet process in the context of mixture modelling is the basis of this paper, and we shall refer to the underlying model as the Dirichlet process mixture model, or DPMM for brevity
In this paper we focus on DPMMs based upon the constructive (stick-breaking) definition of the Dirichlet process due to Sethuraman (1994)
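Sethuraman's constructive definition can be illustrated with a short sketch. It represents a draw from the Dirichlet process as an infinite weighted sum of point masses, with weights ψ_c = V_c ∏_{l<c} (1 − V_l) and V_c ~ Beta(1, α) drawn independently. The Python below is only an illustration of that construction, not the authors' C++ sampler: the function name `stick_breaking_weights`, the symbols ψ and V as used here, and the fixed truncation level `num_sticks` are all assumptions made for the example. The fixed truncation in particular is a simplification; the slice and retrospective approaches cited in the abstract are designed to avoid one.

```python
import numpy as np


def stick_breaking_weights(alpha, num_sticks, rng):
    """Return the first `num_sticks` stick-breaking weights for concentration `alpha`.

    Hypothetical helper for illustration only; truncating at `num_sticks`
    is an assumption of this sketch, not part of the infinite construction.
    """
    # V_c ~ Beta(1, alpha), independently for each component c
    v = rng.beta(1.0, alpha, size=num_sticks)
    # Length of stick left before breaking off component c: prod_{l<c} (1 - V_l)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    # psi_c = V_c * prod_{l<c} (1 - V_l)
    return v * remaining


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for alpha in (0.5, 5.0):
        psi = stick_breaking_weights(alpha, num_sticks=50, rng=rng)
        # Small alpha tends to put most mass on a few components;
        # larger alpha tends to spread it over many.
        print(f"alpha={alpha}: largest weight={psi.max():.3f}, "
              f"mass covered by 50 sticks={psi.sum():.4f}")
```

The mass not covered by the first `num_sticks` weights is exactly the product of the (1 − V_c) terms, which is why the concentration parameter α governs how many components carry appreciable weight.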

Summary

Introduction

Fitting mixture distributions to model some observed data is a common inferential strategy within statistical modelling, used in applications ranging from density estimation to regression analysis. While the continual evolution of samplers might implicitly suggest potential shortcomings of previous samplers, new methods are often illustrated on synthetic or low dimensional datasets, which can mask issues that might arise when using the method on problems of even modest dimension. It appears that little explicit discussion has been presented detailing the inherent difficulties of using a Gibbs (or Metropolis-within-Gibbs) sampling approach to update such a complex model space. There are some exceptions, for example Jain and Neal (2007), in the context of adding additional split-merge type moves into their sampler. For real (rather than synthetic) data applications of the DPMM, the state space can be highly multimodal, with well separated regions of high posterior probability coexisting, often corresponding to clusterings with different numbers of components. We demonstrate that existing sampling methods have difficulty escaping the local modes of such highly multimodal spaces, and that the resulting poor mixing leads to inference that is influenced by sampler initialisation.

Dirichlet process mixture models

Sampling from the DPMM

An example model

Discrete covariates with binary response

Simulated datasets

Mixing of MCMC algorithms for the DPMM

Initial number of clusters

Monitoring convergence

Marginal partition posterior

Our implementation of a DPMM sampler

An optimal partition

Making predictions

Investigation of the algorithm’s properties in a large data application

The data

Posterior distribution of α

Label-switching moves

Conclusions


Journal: Statistics and Computing
Publication Date: May 3, 2014
Citations: 65
License type: CC BY 4.0


