Abstract
BNPmix is an R package for Bayesian nonparametric multivariate density estimation, clustering, and regression, using Pitman-Yor mixture models, a flexible and robust generalization of the popular class of Dirichlet process mixture models. A variety of model specifications and state-of-the-art posterior samplers are implemented. To achieve computational efficiency, all sampling methods are written in C++ and seamlessly integrated into R by means of the Rcpp and RcppArmadillo packages. BNPmix exploits the capabilities of ggplot2 and implements a series of generic functions to plot and print summaries of posterior densities and of the induced clustering of the data.
Highlights
Bayesian nonparametric (BNP) methods provide flexible solutions to complex problems and data which are not described by parametric models (Hjort, Holmes, Müller, and Walker 2010; Müller, Quintana, Jara, and Hanson 2015).
In order to clarify which features are specific to BNPmix and which are shared by other packages, we review state-of-the-art R packages for BNP inference via Markov chain Monte Carlo (MCMC).
The BNPmix package consists of three main R functions, wrappers of C++ routines which implement the BNP models described in Section 2 and the MCMC simulation methods introduced in Section 3, along with some user-friendly functions which facilitate the elicitation of prior distributions and the post-processing of generated posterior samples.
Summary
Bayesian nonparametric (BNP) methods provide flexible solutions to complex problems and data which are not described by parametric models (Hjort, Holmes, Müller, and Walker 2010; Müller, Quintana, Jara, and Hanson 2015). The DPpackage by Jara, Hanson, Quintana, Müller, and Rosner (2011) is probably the most comprehensive of the packages we considered. It is mainly written in Fortran and consists of a rich collection of functions implementing some of the most successful Bayesian nonparametric and semi-parametric models, including DP and dependent Dirichlet process (DDP) mixtures, hierarchical DP, Pólya trees, and random Bernstein polynomials. At the same time, when the focus is on the use of the PY process, BNPmix plays a leading role. It is worth mentioning the increasing attention recently dedicated by the BNP literature to variational methods approximating the posterior distribution (Blei and Jordan 2006; Hughes, Kim, and Sudderth 2015; Campbell, Straub, Fisher III, and How 2015; Tank, Foti, and Fox 2015). The availability of R packages implementing such approaches for BNP models is rather limited, though, a notable exception being the package MixDir (Ahlmann-Eltze and Yau 2018), which implements a hierarchical DP mixture of multinomial kernels. A further comparison with other R packages for BNP inference and technical details on the parametrization of the implemented models are provided in the appendix.