Anchored Bayesian Gaussian mixture models

Deborah Kunkel, Mario Peruggia

doi:10.1214/20-ejs1756

Abstract

Finite mixtures are a flexible modeling tool for irregularly shaped densities and samples from heterogeneous populations. When modeling with mixtures using an exchangeable prior on the component features, the component labels are arbitrary and are indistinguishable in posterior analysis. This makes it impossible to attribute any meaningful interpretation to the marginal posterior distributions of the component features. We propose a model in which a small number of observations are assumed to arise from some of the labeled component densities. The resulting model is not exchangeable, allowing inference on the component features without post-processing. Our method assigns meaning to the component labels at the modeling stage and can be justified as a data-dependent informative prior on the labelings. We show that our method produces interpretable results, often (but not always) similar to those resulting from relabeling algorithms, with the added benefit that the marginal inferences originate directly from a well specified probability model rather than a post hoc manipulation. We provide asymptotic results leading to practical guidelines for model selection that are motivated by maximizing prior information about the class labels and demonstrate our method on real and simulated data.
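The label invariance described above can be written out for a $k$-component Gaussian mixture. This is a sketch in standard notation; the symbols $\eta_j$, $\mu_j$, $\sigma_j^2$ and $\rho$ are assumed here for illustration and are not taken from the abstract.

```latex
% Likelihood of n observations under a k-component Gaussian mixture
p(y \mid \eta, \mu, \sigma^2)
  = \prod_{i=1}^{n} \sum_{j=1}^{k} \eta_j \,
    \mathcal{N}\!\left(y_i \mid \mu_j, \sigma_j^2\right).

% For any permutation \rho of \{1, \dots, k\} the sum is merely reordered, so
p\!\left(y \mid \eta_{\rho(1:k)}, \mu_{\rho(1:k)}, \sigma^2_{\rho(1:k)}\right)
  = p\!\left(y \mid \eta_{1:k}, \mu_{1:k}, \sigma^2_{1:k}\right).

% With an exchangeable prior the posterior inherits this symmetry, hence
p(\mu_1 \mid y) = p(\mu_2 \mid y) = \cdots = p(\mu_k \mid y).
```

Because every component has the same marginal posterior, a summary such as the posterior mean of $\mu_1$ carries no component-specific information unless the symmetry is broken.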

Highlights

Finite mixture models are flexible tools that are often applied to data from heterogeneous populations or from distributions with irregularly shaped densities.
We introduce a modification to the standard finite mixture model, the anchor model, in which a small number of observations are assumed to be drawn from known component densities.
We build on these ideas by formalizing this strategy as a modeling procedure that requires no post-processing of an MCMC sample.

Summary

Introduction

Finite mixture models are flexible tools that are often applied to data from heterogeneous populations or from distributions with irregularly shaped densities. Much work has been devoted to either preventing or reversing label-switching, by placing prior constraints on the parameter space or by post-processing posterior samples in a way that allows only one possible labeling of the mixture components. Because the constraints imposed by a relabeling algorithm are not the result of a clearly defined prior specification, it is difficult to evaluate rigorously the underlying structure that the algorithm imposes upon a problem. It is not obvious whether this approach can be justified as a basis for making inferential claims about the posterior distribution of the component-specific parameters. We introduce a modification to the standard finite mixture model, the anchor model, in which a small number of observations are assumed to be drawn from known component densities. This breaks the model’s label invariance in a data-dependent manner while avoiding the strong, subjective restrictions imposed by prior identifiability constraints.
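As a rough illustration of how fixing the labels of a few observations can break label invariance, the following is a minimal sketch of a Gibbs sampler in which anchored observations keep their component label and are never reallocated. It is not the authors' implementation. The function name `anchored_gibbs`, the known common standard deviation, the prior settings, the toy data, and the rule for picking anchor points (the three smallest and three largest observations) are all assumptions made for this example; the paper's own anchor-selection strategies are not reproduced here.

```python
import math
import random


def anchored_gibbs(y, anchors, k=2, n_iter=300, sigma=1.0,
                   prior_mean=0.0, prior_sd=10.0, alpha=1.0, seed=0):
    """Sketch of a Gibbs sampler for a k-component normal mixture.

    Assumptions made for this example: known common sd `sigma`, a
    N(prior_mean, prior_sd^2) prior on each mean, and a symmetric
    Dirichlet(alpha) prior on the weights. `anchors` maps an observation
    index to the component it is assumed to be drawn from.
    """
    rng = random.Random(seed)
    n = len(y)
    free = [i for i in range(n) if i not in anchors]
    z = [0] * n
    for i, j in anchors.items():
        z[i] = j  # anchored allocations are fixed and never resampled

    # Start each mean at the average of its anchors (overall mean if none).
    mu = []
    for j in range(k):
        pts = [y[i] for i, lab in anchors.items() if lab == j]
        mu.append(sum(pts) / len(pts) if pts else sum(y) / n)
    w = [1.0 / k] * k
    draws = []

    for _ in range(n_iter):
        # 1. Reallocate only the non-anchored observations.
        for i in free:
            logp = [math.log(w[j]) - 0.5 * ((y[i] - mu[j]) / sigma) ** 2
                    for j in range(k)]
            top = max(logp)
            probs = [math.exp(lp - top) for lp in logp]
            z[i] = rng.choices(range(k), weights=probs)[0]

        # 2. Weights: Dirichlet full conditional via normalized gamma draws.
        #    Only non-anchored allocations are counted, on the assumption
        #    that anchors come directly from their component density.
        free_counts = [0] * k
        for i in free:
            free_counts[z[i]] += 1
        g = [rng.gammavariate(alpha + c, 1.0) for c in free_counts]
        w = [gj / sum(g) for gj in g]

        # 3. Means: normal full conditional using anchored and free points.
        for j in range(k):
            members = [y[i] for i in range(n) if z[i] == j]
            prec = 1.0 / prior_sd ** 2 + len(members) / sigma ** 2
            mean = (prior_mean / prior_sd ** 2
                    + sum(members) / sigma ** 2) / prec
            mu[j] = rng.gauss(mean, 1.0 / math.sqrt(prec))
        draws.append(list(mu))
    return draws


# Toy data: two well-separated clusters (simulated, not from the paper).
data_rng = random.Random(1)
y = ([data_rng.gauss(-3, 1) for _ in range(100)]
     + [data_rng.gauss(3, 1) for _ in range(100)])
order = sorted(range(len(y)), key=lambda i: y[i])
# Hypothetical anchor choice: 3 smallest -> component 0, 3 largest -> 1.
anchors = {i: 0 for i in order[:3]}
anchors.update({i: 1 for i in order[-3:]})

draws = anchored_gibbs(y, anchors)
kept = draws[100:]  # discard burn-in
post_mean = [sum(d[j] for d in kept) / len(kept) for j in range(2)]
print(post_mean)
```

In this sketch the anchors are the only thing that ties label 0 to the lower cluster and label 1 to the upper one, so the sampled means can be summarized per label without any relabeling step.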

Anchor models

Definition of an anchor model

Basic properties

Model evidence

Large sample properties and quasi-consistency

Choosing the number of anchor points

Anchored EM algorithm for selecting anchor points

Other strategies for model specification

Sampling

Galaxies

Simulated data

Model 3

A multivariate example: fall detection data

Discussion

A Proofs of the propositions

B Simulation study

Simulation 2

C Anchored EM algorithm

D Random permutation sampler

F Analysis of the SisFall data

Journal: Electronic Journal of Statistics	Publication Date: Jan 1, 2020
Citations: 2	License type: cc-by
