Abstract
This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via the Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test-based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results reflect uncertainty in the final model and report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally lightweight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
Highlights
Finite mixture models naturally arise when homogeneous subgroups or clusters are thought to be present in a population, and can be used as flexible parametric models for estimating complex or unknown distributions [1].
The success of Zmix for order estimation is closely tied to the sample size relative to the underlying complexity of the mixture distribution being overfitted.
The algorithm is easy to implement and interpret, and requires only that a maximum number of components is specified and that this is larger than the expected upper bound of K0. It is based on the same basic format and conditional distributions as a standard Gibbs sampler on a single parametric model, with the addition of a range of prior hyperparameter values implemented in the Prior Parallel Tempering (PPT) algorithm.
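The key idea behind the overfitting approach, that a sparse prior on the mixture weights drives superfluous components towards zero, can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration in Python (the paper's actual implementation is the R package Zmix); the values of the hyperparameter, component counts and sample size are invented for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: posterior Dirichlet draws for mixture weights
# in an overfitted model. Suppose K_max = 5 components are fitted but the
# data support only K0 = 2, so the allocation counts for the three
# superfluous components are zero. A small Dirichlet hyperparameter alpha
# pushes their posterior weights towards zero, which is the behaviour the
# priors in Zmix are chosen to encourage.
alpha = 0.01                           # sparse Dirichlet hyperparameter
counts = np.array([60, 40, 0, 0, 0])   # allocation counts, n = 100

# Conjugate update: weights | allocations ~ Dirichlet(alpha + counts)
draws = rng.dirichlet(alpha + counts, size=5000)
mean_weights = draws.mean(axis=0)

# The empty components receive negligible posterior weight.
print(np.round(mean_weights, 3))
```

In the full sampler this weight update is one step of the Gibbs sweep, and PPT runs parallel chains over a range of `alpha` values to improve mixing between the sparse and diffuse ends of the prior.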
Summary
Finite mixture models naturally arise when homogeneous subgroups or clusters are thought to be present in a population, and can be used as flexible parametric models for estimating complex or unknown distributions [1]. Whether latent subgroups are present or not, their flexible framework has the potential to help tackle many research problems. As such, they are useful tools in many fields including but not limited to genetic and medical research [2,3,4], econometrics [5], and image and sound analysis, where mixtures are used to perform complex tasks such as object tracking and speaker identification [6,7]. Despite their popularity, model estimation can be difficult when the number of components is unknown [8].