Abstract

In this article, we estimate the parameters of the Poisson-Dirichlet mixture model with a multigroup data structure by empirical Bayes. Under Bayesian nonparametric process priors, the number of mixture components is not fixed in advance and can grow as the amount of data increases. Empirical Bayes is a useful method for estimating the mixture components when no information about them is available in advance. We give a procedure for constructing smooth estimates of the base distribution G_0 and estimates of the two parameters α and θ. Numerical simulations show that the parameter estimates obtained from multigroup data outperform those obtained from single-group data with the same total number of individuals in terms of bias, standard deviation, and mean squared error. We also apply Poisson-Dirichlet mixture models to well-known real datasets.

Highlights

  • Mixture models are typically used to model data in which each observation belongs, with certain probabilities, to one of a finite or infinite set of distributions

  • In finite mixture models (FMM), the number of unknown distributions is assumed to be known and fixed, and every individual is sampled from one of those distributions with some probability, whereas in Dirichlet process (DP) mixture models the number of clusters need not be given in advance and is inferred from the data. The number of clusters in a DP mixture model is random and grows on the order of log(sample size). There is a large body of research on DP mixture models in Bayesian nonparametrics; see, for example, the works of Ferguson [9], MacEachern and Müller [10], Neal [11], Ishwaran and James [12], Quintana et al. [13], and so on

  • Some simulations are reported to show the performance of the estimates provided above. The data structure of the Poisson-Dirichlet process (PDP) mixture model follows equation (3), and the simulations were conducted under two settings


Summary

Introduction

Mixture models are typically used to model data in which each observation belongs, with certain probabilities, to one of a finite or infinite set of distributions. In finite mixture models (FMM), the number of unknown distributions is assumed to be known and fixed, and every individual is sampled from one of those distributions with some probability, whereas in Dirichlet process (DP) mixture models the number of clusters need not be given in advance and is inferred from the data. The individuals in every group follow a random probability distribution built on the base distribution G_0. We handle this by nonparametric empirical Bayes; for the estimation of α and θ, see, among others, the works of Carlton [18] and Favaro et al. [19]. That is, given the first n draws, the next draw takes a previously observed value η*_k with probability (m_k − α)/(n + θ) and a new value from G_0 with probability (Kα + θ)/(n + θ), where m_k is the number of earlier draws equal to η*_k and K is the number of distinct values observed so far. This is the generalized Pólya urn scheme that produces η_i, i = 1, 2, …
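The urn scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `sample_generalized_polya_urn` is hypothetical, and the standard normal used for G_0 in the usage lines is an assumption made only so the example runs. The parameter constraints 0 ≤ α < 1 and θ > −α are the usual ones for the two-parameter Poisson-Dirichlet process.

```python
import random


def sample_generalized_polya_urn(n, alpha, theta, base_sampler, rng):
    """Draw eta_1, ..., eta_n from the generalized Polya urn.

    base_sampler(rng) returns one draw from the base distribution G_0
    (a caller-supplied, hypothetical helper).
    """
    if not (0.0 <= alpha < 1.0) or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")

    atoms = []   # distinct values eta*_k seen so far
    counts = []  # m_k: number of draws equal to eta*_k
    draws = []

    for i in range(n):  # i is the number of draws already made
        k = len(atoms)
        # The first draw always comes from G_0; afterwards a new value
        # appears with probability (K*alpha + theta) / (i + theta).
        p_new = 1.0 if i == 0 else (k * alpha + theta) / (i + theta)
        if rng.random() < p_new:
            atoms.append(base_sampler(rng))
            counts.append(1)
            draws.append(atoms[-1])
        else:
            # An existing value eta*_k is reused with probability
            # proportional to m_k - alpha.
            weights = [m - alpha for m in counts]
            idx = rng.choices(range(k), weights=weights)[0]
            counts[idx] += 1
            draws.append(atoms[idx])
    return draws, atoms, counts


# Usage: G_0 taken to be a standard normal purely for illustration.
rng = random.Random(0)
draws, atoms, counts = sample_generalized_polya_urn(
    200, alpha=0.5, theta=1.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng,
)
print(len(atoms), "distinct values among", len(draws), "draws")
```

Setting α = 0 reduces the rule to the ordinary Pólya urn of the Dirichlet process, which is one way to compare how quickly the number of distinct values grows under the two priors.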

A PDP mixture model is defined by
Simulation
Applications to Real Data
Conclusions
