Abstract

The Generalized Chinese Restaurant Process (GCRP) describes a sequence of exchangeable random partitions of the numbers $\{1,\dots,n\}$. This process is related to the Ewens sampling model in genetics and to Bayesian nonparametric methods such as topic models. In this paper, we study the GCRP in a regime where the number of parts grows like $n^{\alpha}$ with $\alpha > 0$. We prove a non-asymptotic concentration result for the number of parts of size $k=o(n^{\alpha/(2\alpha+4)}/(\log n)^{1/(2+\alpha)})$. In particular, we show that these random variables concentrate around $c_k V_* n^{\alpha}$, where $V_* n^{\alpha}$ is the asymptotic number of parts and $c_k \approx k^{-(1+\alpha)}$ is a positive value depending on $k$. We also obtain finite-$n$ bounds for the total number of parts. Our theorems complement asymptotic statements by Pitman and more recent results on large and moderate deviations by Favaro, Feng and Gao.
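
The quantities discussed above (the total number of parts and the number of parts of size $k$) can be explored numerically. The sketch below is only illustrative: the abstract does not state the seating rule, so the code assumes the standard two-parameter form with a discount `alpha` in $[0,1)$ and a concentration parameter `theta` $> -\alpha$. The names `simulate_gcrp` and `parts_by_size`, and the parameter `theta`, are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def simulate_gcrp(n, alpha, theta, seed=0):
    """Return the list of part (table) sizes after n customers are seated.

    Assumed seating rule (standard two-parameter form, an assumption here):
    with m customers already seated at K tables, the next customer joins a
    table of size s with probability (s - alpha) / (m + theta) and opens a
    new table with probability (theta + alpha * K) / (m + theta).
    """
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = random.Random(seed)
    sizes = []
    for m in range(n):
        if m == 0:
            sizes.append(1)  # the first customer always opens a table
            continue
        new_table_weight = theta + alpha * len(sizes)
        u = rng.random() * (m + theta)  # total weight is m + theta
        if u < new_table_weight:
            sizes.append(1)
            continue
        u -= new_table_weight
        for i, s in enumerate(sizes):
            u -= s - alpha
            if u < 0:
                sizes[i] += 1
                break
        else:
            sizes[-1] += 1  # guard against floating-point rounding
    return sizes


def parts_by_size(sizes):
    """Map each part size k to the number of parts having exactly that size."""
    return Counter(sizes)


if __name__ == "__main__":
    n, alpha, theta = 5000, 0.5, 1.0
    sizes = simulate_gcrp(n, alpha, theta)
    counts = parts_by_size(sizes)
    print("number of parts:", len(sizes), " n^alpha:", round(n ** alpha, 1))
    for k in range(1, 6):
        print(f"parts of size {k}:", counts[k])
```

Running this for increasing `n` and comparing `len(sizes)` with `n ** alpha`, and the counts for small `k` with the total number of parts, gives a rough empirical picture of the scaling described in the abstract. The inner loop costs time proportional to the current number of parts, which is acceptable for a sketch but not for very large `n`.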
