Abstract

Stan is a probabilistic programming language that has been increasingly used for real-world scalable projects. However, to make practical inference possible, the language sacrifices some of its usability by adopting a block syntax, which lacks compositionality and flexible user-defined functions. Moreover, the semantics of the language has been mainly given in terms of intuition about implementation, and has not been formalised. This paper provides a formal treatment of the Stan language, and introduces the probabilistic programming language SlicStan, a compositional, self-optimising version of Stan. Our main contributions are (1) the formalisation of a core subset of Stan through an operational density-based semantics; (2) the design and semantics of the Stan-like language SlicStan, which facilitates better code reuse and abstraction through its compositional syntax, more flexible functions, and information-flow type system; and (3) a formal, semantics-preserving procedure for translating SlicStan to Stan.

Highlights

  • Probabilistic programming languages [Gordon et al 2014b] are a concise notation for specifying probabilistic models, while abstracting the underlying inference algorithm

  • This paper provides a formal treatment of the Stan language, and introduces the probabilistic programming language SlicStan, a compositional, self-optimising version of Stan

  • Stan’s syntax is designed to enable automatic compilation to an efficient Hamiltonian Monte Carlo (HMC) inference algorithm [Neal et al 2011], which allows programs to scale to real-world projects in statistics and data science. (For example, the forecasting tool Prophet [Taylor and Letham 2017] uses Stan.) This efficiency comes at a price: Stan’s syntax lacks the compositionality of other similar systems, such as Edward [Tran et al 2016] and PyMC3 [Salvatier et al 2016]


Summary

Background

Probabilistic programming languages [Gordon et al 2014b] are a concise notation for specifying probabilistic models, while abstracting the underlying inference algorithm. There are many such languages, including BUGS [Gilks et al 1994], JAGS [Plummer et al 2003], Anglican [Wood et al 2014], Church [Goodman et al 2012], Infer.NET [Minka et al 2014], Venture [Mansinghka et al 2014] and Edward [Tran et al 2016]. The design of Stan assumes that the programmer organises their model into separate blocks, which correspond to different stages of the inference algorithm (preprocessing, sampling, postprocessing). As a result, it is difficult to write complex Stan programs and to encapsulate distributions and sub-model structures into re-usable libraries.
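The three stages named above can be sketched in plain Python. This is an analogy only, not Stan code and not the paper's formalisation: the toy model (a standard normal prior on `mu` and a unit-variance normal likelihood) and every function name are assumptions chosen for illustration. It is meant to show why each piece of a model has to be assigned to one stage: preprocessing runs once, the log density is evaluated at every sampler step, and postprocessing runs once per draw.

```python
import math


def normal_lpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)


def transformed_data(y):
    """Preprocessing stage (hypothetical): runs once, before inference."""
    mean_y = sum(y) / len(y)
    return [yi - mean_y for yi in y], mean_y


def model_log_density(mu, y_centred):
    """Sampling stage (hypothetical): evaluated at every sampler step."""
    lp = normal_lpdf(mu, 0.0, 1.0)        # assumed prior: mu ~ normal(0, 1)
    for yi in y_centred:                  # assumed likelihood: y ~ normal(mu, 1)
        lp += normal_lpdf(yi, mu, 1.0)
    return lp


def generated_quantities(mu, mean_y):
    """Postprocessing stage (hypothetical): runs once per retained draw."""
    return mu + mean_y


y = [1.0, 2.0, 3.0]
y_centred, mean_y = transformed_data(y)
lp_at_zero = model_log_density(0.0, y_centred)
```

Moving the centring step into `model_log_density` would give the same density but repeat the work at every step. Choosing where each statement belongs is the kind of decision that the block structure leaves to the programmer.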

Goals and Key Insight
The Insight by Example
Core Contributions and Outline
CORE STAN
Syntax of Core Stan Expressions and Statements
Syntax of Stan
Density-Based Semantics of Stan
Inference
SLICSTAN
Syntax
Typing of SlicStan
Elaboration of SlicStan
Semantics of SlicStan
Examples
Difficulty of Specifying Direct Semantics Without Elaboration
TRANSLATION OF SLICSTAN TO STAN
Shredding
Transformation
EXAMPLES AND DISCUSSION
Type Inference
Locality
Code Reuse
RELATED WORK
Formalisation of Probabilistic Programming Languages
Static Analysis for Probabilistic Programming Languages
Usability of Probabilistic Programming Languages
CONCLUSION
