Abstract
Bayesian nonparametric density estimation is dominated by single-scale methods, typically based on mixture model specifications, with the notable exception of Pólya tree priors and allied approaches. In this paper we develop a novel family of multiscale stick-breaking mixture models that inherits some of the advantages of both single-scale nonparametric mixtures and Pólya trees. Our proposal is based on a mixture specification exploiting an infinitely deep binary tree of random weights that grows according to a multiscale generalization of a large class of stick-breaking processes; this multiscale stick-breaking is paired with specific stochastic processes generating sequences of parameters that induce stochastically ordered kernel functions. We describe the properties of this family of multiscale stick-breaking mixtures. Focusing on a Gaussian specification, we introduce a Markov chain Monte Carlo algorithm for posterior computation. The performance of the method is illustrated on both synthetic and real datasets, consistently showing competitive results in scenarios favoring either single-scale or multiscale methods. The results suggest that the method is well suited to estimating densities with varying degrees of smoothness and local features.
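The binary-tree weight construction described above can be sketched as follows. This is a hypothetical, truncated illustration, not the paper's exact prior: each node draws a Beta(a, b) "stop" probability, the surviving mass is split between the two children by a Beta(1, 1) "go right" variable, and the infinitely deep tree is cut at a finite depth where all remaining mass stops.

```python
import numpy as np

def msb_weights(max_depth, a=1.0, b=1.0, seed=0):
    """Sketch of multiscale stick-breaking weights on a binary tree.

    Hypothetical construction for illustration: node (s, h) at depth s
    receives a share of the unit stick, keeps a Beta(a, b) fraction of
    it, and passes the rest to its two children according to a
    Beta(1, 1) split. Truncating at max_depth forces all leaves to stop.
    """
    rng = np.random.default_rng(seed)
    weights = {}
    mass = {(0, 0): 1.0}  # stick mass reaching each node; root gets it all
    for s in range(max_depth + 1):
        for h in range(2 ** s):
            m = mass[(s, h)]
            stop = 1.0 if s == max_depth else rng.beta(a, b)
            weights[(s, h)] = m * stop
            if s < max_depth:
                right = rng.beta(1.0, 1.0)
                mass[(s + 1, 2 * h)] = m * (1.0 - stop) * (1.0 - right)
                mass[(s + 1, 2 * h + 1)] = m * (1.0 - stop) * right
    return weights

w = msb_weights(max_depth=4)
print(round(sum(w.values()), 10))  # truncation forces the weights to sum to 1
```

Setting a = b = 1 recovers uniform stopping probabilities; other choices of (a, b) tilt the prior toward coarser or finer scales.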
Highlights
Nonparametric models have well-known advantages for their weak set of assumptions and great flexibility in a variety of situations
For the larger sample size of n = 500, all the methods improve in terms of precision, with our multiscale mixture (MSM) always performing slightly better than the DP mixture (DPM) and the smoothed approximate Pólya tree (SAPT), and with the hierarchical infinite mixture (HIM) improving substantially over all the methods
These results show that both our MSM approach and HIM are able to adapt to the actual smoothness of the density
Summary
Nonparametric models have well-known advantages owing to their weak set of assumptions and great flexibility in a variety of situations. Bayesian nonparametrics (BNP) has received abundant attention in recent decades and is nowadays a well-established modeling option in the data scientist’s toolbox. While standard parametric Bayesian inference focuses on the posterior distribution obtained by defining suitable prior distributions over a finite-dimensional parameter space Ξ, with ξ ∈ Ξ typically characterizing a specific parametric distribution Gξ for data y = (Y1, …, Yn), in BNP one defines prior distributions on infinite-dimensional probability spaces that flexibly characterize the data distribution G.
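As a concrete (hypothetical) instance of the Gaussian specification mentioned in the abstract, a truncated multiscale mixture pairs each tree node with a normal kernel whose scale shrinks with depth, so deeper nodes capture finer local features. The weights, means, and scales below are made-up illustrative values, not estimates from the paper:

```python
import numpy as np

def normal_pdf(y, mu, sigma):
    # standard Gaussian density, written out to avoid extra dependencies
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical tree of (weight, mean, scale) triples up to depth 2:
# scales halve with depth, giving the stochastically ordered kernels
# described in the abstract (coarse at the root, fine at the leaves).
nodes = [
    (0.40, 0.500, 0.40),                        # depth 0 (root)
    (0.15, 0.250, 0.20), (0.15, 0.750, 0.20),   # depth 1
    (0.10, 0.125, 0.10), (0.05, 0.375, 0.10),   # depth 2
    (0.10, 0.625, 0.10), (0.05, 0.875, 0.10),
]
assert abs(sum(w for w, _, _ in nodes) - 1.0) < 1e-12

def density(y, nodes):
    # finite mixture: f(y) = sum over nodes of weight * N(y; mean, scale)
    return sum(w * normal_pdf(y, mu, sd) for w, mu, sd in nodes)

# Riemann-sum check that the mixture integrates to (approximately) one
grid = np.linspace(-3.0, 4.0, 2001)
mass = float(np.sum(density(grid, nodes)) * (grid[1] - grid[0]))
print(round(mass, 3))  # ≈ 1.0
```

Because every kernel is a proper density and the weights sum to one, the mixture is itself a proper density; the numerical check above only verifies the implementation.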