We define a core calculus for the purpose of investigating reasoning principles of probabilistic programming languages. By using a variation of a technique called higher-order abstract syntax (HOAS), which is common in the implementation of domain-specific languages, the calculus captures the semantics of a stochastic language with observation while being agnostic to the details of its deterministic portions. By remaining agnostic to the non-stochastic portions of the language, this style of semantics enables the discovery of general reasoning principles for the principled manipulation of probabilistic program fragments by programmers, compilers, and analysis tools. This generality allows us to reason about probabilistic program fragments without the need to resort to the underlying measure theory in every instance, by instead enabling reasoning in terms of the core calculus in a way that we believe to be applicable to various surface-level languages.

Theo Giannakopoulos (BAE Systems) and Mitchell Wand & Andrew Cobb (Northeastern University)

**Q&A**:

- Q: Why develop two core calculi?

A: Using a calculus that does not support recursion makes the initial discovery process easier, since it avoids the technical overhead of dealing with recursion. As long as we can make progress within the non-recursive calculus, we would prefer to do so, especially when it seems that our results do not rely on the lack of recursion. However, most PPLs support recursion and looping constructs, and so we will eventually need to be working in a calculus that includes those constructs. - Q: Why use HOAS trees to express the uniform dist-equivalence property?

A: The property can be expressed independently of the surface language, e.g. by working directly in Panangaden’s category of Markov kernels. However, having a surface language is convenient for expressing properties and for providing a framework for later introducing additional constructs. - Q: Why can’t be expressed in terms of , while and can?

A: is a way to express a “measure literal” in the language. In particular, is the embedding of the zero measure and is the embedding of the indicator function for 42. On the other hand, is an operator which combines two measures, and so is not a measure itself.