Abstract

The Indian buffet process (IBP) and phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a data set. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that under a mild sparsity condition, the posterior distribution of the latent feature matrix, generated via IBP or pIBP priors, converges to the true latent feature matrix asymptotically. We derive the posterior convergence rate, referred to as the contraction rate. We show that the convergence results remain valid even when the dimensionality of the latent feature matrix increases with the sample size, thereby making posterior inference valid in high-dimensional settings. We demonstrate the theoretical results using computer simulations, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse-phase protein array (RPPA) dataset under the IBP prior model.
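
For intuition about the prior model the abstract refers to, the IBP can be simulated with the standard "customers and dishes" construction: object i keeps an existing feature k with probability m_k / i (where m_k is the number of earlier objects possessing it) and adds Poisson(alpha / i) new features. The sketch below is illustrative only; the function name `sample_ibp` and its NumPy-based implementation are our own and are not taken from the paper.

```python
import numpy as np

def sample_ibp(n, alpha, rng=None):
    """Draw a binary latent feature matrix Z (n rows) from the IBP prior."""
    rng = np.random.default_rng() if rng is None else rng
    Z = np.zeros((n, 0), dtype=int)                   # start with zero features (columns)
    for i in range(1, n + 1):
        K = Z.shape[1]
        if K > 0:
            m = Z[: i - 1].sum(axis=0)                # popularity of each existing feature
            Z[i - 1, :] = rng.random(K) < m / i       # keep feature k with prob m_k / i
        k_new = rng.poisson(alpha / i)                # number of brand-new features for object i
        if k_new > 0:
            new_cols = np.zeros((n, k_new), dtype=int)
            new_cols[i - 1, :] = 1                    # only object i has the new features so far
            Z = np.hstack([Z, new_cols])
    return Z

# Example: 100 objects; the number of features K grows roughly like alpha * log(n).
Z = sample_ibp(n=100, alpha=2.0)
print(Z.shape)
```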
