Abstract

Bayesian nonparametric density estimation is dominated by single-scale methods, typically based on mixture model specifications, with the notable exception of the Pólya tree prior and allied approaches. In this paper we develop a novel family of multiscale stick-breaking mixture models that inherits advantages of both single-scale nonparametric mixtures and Pólya trees. Our proposal rests on a mixture specification exploiting an infinitely deep binary tree of random weights that grows according to a multiscale generalization of a large class of stick-breaking processes; this multiscale stick-breaking is paired with specific stochastic processes generating sequences of parameters that induce stochastically ordered kernel functions. We describe the properties of this family of multiscale stick-breaking mixtures. Focusing on a Gaussian specification, we introduce a Markov chain Monte Carlo algorithm for posterior computation. The performance of the method is illustrated on both synthetic and real datasets, consistently showing competitive results in scenarios favoring either single-scale or multiscale methods. The results suggest that the method is well suited to estimating densities with varying degrees of smoothness and local features.
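The tree-of-weights construction described in the abstract (at each node of a binary tree, a stick-breaking variable decides whether the mixture "stops" there or descends to a child) can be sketched in a truncated form. The sketch below is illustrative only: the Beta stick-breaking variables, the dyadic kernel locations, and the depth-halving kernel scale are assumptions made here for concreteness, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def multiscale_weights(max_depth, a=1.0, b=1.0):
    """Sample mixture weights on a binary tree truncated at max_depth.

    Each node (s, h), scale s and position h, stops with Beta(a, b)
    probability S and otherwise descends right with Beta(a, b)
    probability R.  Returns {(s, h): weight}; weights sum to one
    because stopping is forced at the truncation level.
    """
    weights = {}
    reach = {(0, 0): 1.0}  # probability of reaching each node
    for s in range(max_depth + 1):
        for h in range(2 ** s):
            S = rng.beta(a, b) if s < max_depth else 1.0
            R = rng.beta(a, b)
            p = reach[(s, h)]
            weights[(s, h)] = p * S
            if s < max_depth:
                reach[(s + 1, 2 * h)] = p * (1 - S) * (1 - R)
                reach[(s + 1, 2 * h + 1)] = p * (1 - S) * R
    return weights

def density(x, weights, sigma0=1.0):
    """Evaluate the Gaussian multiscale mixture on a grid x.

    Node (s, h) contributes a Gaussian kernel centred at the dyadic
    point (h + 0.5) / 2**s whose standard deviation halves with each
    scale, so deeper nodes capture finer local features.
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for (s, h), w in weights.items():
        mu = (h + 0.5) / 2 ** s
        sd = sigma0 / 2 ** s
        out += w * np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    return out
```

Because mass not assigned at a node is passed entirely to its two children, the weights form a proper probability distribution over tree nodes, mirroring how the infinitely deep construction allocates all mass across scales.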

Highlights

  • Nonparametric models have well-known advantages for their weak set of assumptions and great flexibility in a variety of situations

  • For the higher sample size of n = 500, all the methods improve in terms of precision, with our multiscale mixture (MSM) always performing slightly better than the DP mixture (DPM) and the smoothed approximate Pólya tree (SAPT), and with a substantial improvement of the hierarchical infinite mixture (HIM) over all the other methods

  • These results show that both our MSM approach and HIM are able to adapt to the actual smoothness of the density



Introduction

Nonparametric models have well-known advantages owing to their weak set of assumptions and great flexibility in a variety of situations. Bayesian nonparametrics (BNP) has received abundant attention in recent decades and is nowadays a well-established modeling option in the data scientist's toolbox. While standard parametric Bayesian inference focuses on the posterior distribution obtained by defining suitable prior distributions over a finite-dimensional parameter space Ξ, with ξ ∈ Ξ typically characterizing a specific parametric distribution Gξ for data y = (y1, …, yn), in BNP one defines prior distributions on infinite-dimensional probability spaces that flexibly characterize the data distribution G.
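The contrast between the two paradigms can be written compactly. In the sketch below, π denotes a prior on the finite-dimensional parameter and Π a prior on the space of distributions; the symbol Π is an illustrative addition, not notation taken from the paper.

```latex
% Parametric Bayes: prior on a finite-dimensional parameter
y_i \mid \xi \sim G_{\xi}, \qquad \xi \sim \pi, \qquad \xi \in \Xi

% Bayesian nonparametrics: prior directly on the distribution
y_i \mid G \sim G, \qquad G \sim \Pi
```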

Multiscale stick-breaking mixture
Multiscale kernel’s parameters
Location parameters
Scale parameters
Multiscale mixture of Gaussians
Multiscale mixture of other kernels
Posterior computation
Illustrations
The role of ı
Comparison with alternative methods
Roeder’s galaxy speed data
Sloan Digital Sky Survey data
