Abstract

Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models. In this paper, we introduce a probabilistic programming language (PPL) based on abductive logic programming for performing inference in probabilistic models involving categorical distributions with Dirichlet priors. We encode these models as abductive logic programs enriched with probabilistic definitions and queries, and show how to execute and compile them to Boolean formulas. Using the latter, we perform generalized inference with one of two proposed Markov Chain Monte Carlo (MCMC) sampling algorithms: an adaptation of uncollapsed Gibbs sampling from related work and a novel collapsed Gibbs sampling (CGS). We show that CGS converges faster than the uncollapsed version on a latent Dirichlet allocation (LDA) task using synthetic data. On similar data, we compare our PPL with LDA-specific algorithms and with other PPLs. We find that all methods except one perform similarly, and that the more expressive the PPL, the slower it is. We illustrate applications of our PPL on real data in two variants of LDA (Seed and Cluster LDA) and in the repeated insertion model (RIM). In the latter, our PPL yields conclusions similar to those of inference with EM for Mallows models.
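
For intuition, collapsed Gibbs sampling marginalizes out the categorical parameters under their Dirichlet priors and resamples the latent assignments directly. In the LDA case this corresponds to the well-known collapsed update sketched below, written here with symmetric hyperparameters \alpha and \beta as an assumption; the update used by the paper's CGS may be stated differently:

P(z_i = k \mid z_{-i}, w) \;\propto\; \frac{n^{-i}_{k, w_i} + \beta}{n^{-i}_{k} + V\beta} \left( n^{-i}_{d_i, k} + \alpha \right)

where z_i is the topic of the i-th token, n^{-i}_{k, w_i} counts how often word w_i is assigned to topic k, n^{-i}_{k} is the total number of tokens assigned to topic k, n^{-i}_{d_i, k} counts the tokens of document d_i assigned to topic k (all counts excluding position i), and V is the vocabulary size.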

Highlights

  • Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models

  • We show how our probabilistic programming language (PPL) can be used to perform inference in two classes of probabilistic models: Latent Dirichlet Allocation (LDA, [12]), a well-studied approach to topic modeling, including two variations thereof (Seed LDA and Cluster LDA); and the repeated insertion model (RIM, [13]), a model used for preference modeling whose generative story can be expressed using recursion

  • We show that inference in PB yields reasonable results given our intuition about the models, and that in the case of the repeated insertion model we reach conclusions similar to those obtained by inference with EM for Mallows models

Introduction

Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models. PRISM is a PPL that introduces Dirichlet priors over categorical distributions and is designed for efficient inference in models with non-overlapping explanations. In contrast, we introduce a PPL based on abductive logic programming for performing inference in probabilistic models involving categorical distributions with Dirichlet priors. We encode these models as abductive logic programs [10] enriched with probabilistic definitions and inference queries, such that the result of abduction allows overlapping explanations. We show how our PPL can be used to perform inference in two classes of probabilistic models: Latent Dirichlet Allocation (LDA, [12]), a well-studied approach to topic modeling, including two variations thereof (Seed LDA and Cluster LDA); and the repeated insertion model (RIM, [13]), a model used for preference modeling whose generative story can be expressed using recursion.
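
To make concrete how the RIM generative story can be phrased recursively, the following short Python sketch samples a ranking by recursively ranking all but the last item of the reference order and then inserting that item at a position drawn from its insertion distribution. The function and parameter names (rim_sample, insert_probs) are illustrative assumptions and are not the syntax of our PPL:

import random

def rim_sample(items, insert_probs):
    # Sample a ranking from a repeated insertion model (RIM).
    # items: the reference ordering (pi_1, ..., pi_m).
    # insert_probs[i]: a list of i+1 probabilities; the (i+1)-th reference item
    # is inserted at position j of the partial ranking with probability insert_probs[i][j].
    # Illustrative sketch only; names are not taken from the paper's PPL.
    if not items:                                    # base case: empty ranking
        return []
    partial = rim_sample(items[:-1], insert_probs)   # recursively rank the prefix
    i = len(items) - 1
    pos = random.choices(range(i + 1), weights=insert_probs[i])[0]
    partial.insert(pos, items[-1])                   # insert the last item at the drawn position
    return partial

# Example: three items with uniform insertion probabilities
print(rim_sample(["a", "b", "c"], [[1.0], [0.5, 0.5], [1/3, 1/3, 1/3]]))

Writing the insertion step as a recursive call mirrors how the same generative story would be expressed with recursive clauses in a logic-programming setting.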

The probabilistic model
The uncollapsed PB model
The collapsed PB model
Syntax and semantics
Abductive logic programming and PB
Knowledge compilation and multiple observations
MCMC sampling
Uncollapsed Gibbs sampling
Collapsed Gibbs sampling
Evaluation
PB for cluster LDA on arXiv abstracts
PB for RIM on Sushi dataset
Related work
Conclusions and future work