Abstract

We introduce mixtures of species sampling sequences (mSSS) and discuss how these sequences are related to various types of Bayesian models. As a particular case, we recover species sampling sequences with general (not necessarily diffuse) base measures. These models include some “spike-and-slab” non-parametric priors recently introduced to provide sparsity. Furthermore, we show how mSSS arise while considering hierarchical species sampling random probabilities (e.g., the hierarchical Dirichlet process). Extending previous results, we prove that mSSS are obtained by assigning the values of an exchangeable sequence to the classes of a latent exchangeable random partition. Using this representation, we give an explicit expression of the Exchangeable Partition Probability Function of the partition generated by an mSSS. Some special cases are discussed in detail—in particular, species sampling sequences with general base measures and a mixture of species sampling sequences with Gibbs-type latent partition. Finally, we give explicit expressions of the predictive distributions of an mSSS.
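The representation stated in the abstract (values of an exchangeable sequence assigned to the classes of a latent exchangeable random partition) can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's construction: the latent partition is drawn from a Chinese restaurant process (the Dirichlet process case, one example of a Gibbs-type partition), and the block values are drawn i.i.d. (a special case of exchangeable) from a hypothetical spike-and-slab base measure that mixes a point mass at 0 with a standard normal. The function names and the parameters `theta` and `spike_prob` are illustrative choices and do not appear in the source.

```python
import random


def crp_partition(n, theta, rng):
    """Sample block labels for n items from a Chinese restaurant process.

    Assumption: the latent partition is the one induced by a Dirichlet
    process with concentration theta; labels are in order of appearance.
    """
    labels = []
    counts = []  # counts[k] = current size of block k
    for i in range(n):
        # Item i joins existing block k with probability counts[k] / (theta + i)
        # and opens a new block with probability theta / (theta + i).
        u = rng.random() * (theta + i)
        acc = 0.0
        chosen = len(counts)  # default: open a new block
        for k, c in enumerate(counts):
            acc += c
            if u < acc:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(1)
        else:
            counts[chosen] += 1
        labels.append(chosen)
    return labels


def spike_and_slab(rng, spike_prob, spike_value=0.0):
    """Draw from a non-diffuse base measure: an atom at spike_value mixed with N(0, 1)."""
    if rng.random() < spike_prob:
        return spike_value
    return rng.gauss(0.0, 1.0)


def sample_sequence(n, theta=1.0, spike_prob=0.5, seed=0):
    """Assign one base-measure draw to each block of a latent partition."""
    rng = random.Random(seed)
    labels = crp_partition(n, theta, rng)
    n_blocks = max(labels, default=-1) + 1
    block_values = [spike_and_slab(rng, spike_prob) for _ in range(n_blocks)]
    xs = [block_values[k] for k in labels]
    return labels, xs


if __name__ == "__main__":
    labels, xs = sample_sequence(20, theta=2.0, spike_prob=0.5, seed=1)
    print("latent blocks:", max(labels) + 1)
    print("distinct observed values:", len(set(xs)))
```

Because the base measure in this sketch has an atom, two different latent blocks can receive the same value 0. The partition induced by ties among the observations can therefore be coarser than the latent partition, which is why the number of distinct observed values may be smaller than the number of latent blocks.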

Highlights

  • Discrete random measures have been widely used in Bayesian nonparametrics

  • Let ξ be as in Definition 2, and let (Z_n^0)_n and Π_n be the exchangeable sequence of random variables and the exchangeable random partition appearing in Proposition 3

  • We have defined a new class of exchangeable sequences, called mixtures of species sampling sequences

Summary

Introduction

Discrete random measures have been widely used in Bayesian nonparametrics. Noteworthy examples are the Dirichlet process [1], the Pitman–Yor process [2,3], (homogeneous) normalized random measures with independent increments (see, e.g., [4,5,6,7]), Poisson–Kingman random measures [8], and stick-breaking priors [9]. Species sampling sequences [15] are exchangeable sequences whose directing measure is a discrete random probability of type (1), and their definition assumes that the base measure H is diffuse. This assumption is motivated by the interpretation of species sampling sequences as describing a sampling mechanism for discovering species from an unknown population. This result is achieved by considering two EPPFs arising from a suitable latent partition structure.
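The display labelled (1) is not reproduced in this summary. As a hedged reconstruction, assuming the standard convention for species sampling models rather than quoting the paper, a discrete random probability of this type can be written as follows, where the weights p_j and the i.i.d. draws Z_j from the base measure H are assumed to be independent of each other:

```latex
% Assumed form of a discrete random probability "of type (1)";
% the notation p_j, Z_j, \delta is the usual convention, not copied from the source.
P \;=\; \sum_{j \ge 1} p_j \, \delta_{Z_j},
\qquad p_j \ge 0, \quad \sum_{j \ge 1} p_j = 1,
\qquad Z_j \overset{\text{i.i.d.}}{\sim} H .
```

When H is diffuse, the atoms Z_j are almost surely distinct, so ties among observations identify the species exactly. This no longer holds once H is allowed to have atoms.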

Exchangeable Random Partitions
Species Sampling Models
Definitions and Relation to Other Models
Representation Theorems for mSSS
Random Partitions Induced by mSSS
Explicit Expression of the EPPF
EPPF When Π Is of Gibbs Type
EPPF for gSSS with Spike-and-Slab Base Measure
Some General Results
Predictive Distributions for gSSS
Conclusions and Discussion