Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: categorical probability

Topic: Axiomatisation of probabilistic Boolean circuits


view this post on Zulip Robin Piedeleu (Aug 28 2024 at 06:47):

Hi! @Mateo Torres-Ruiz , Alexandra Silva, @Fabio Zanasi , and I have just uploaded an arXiv preprint which might be of interest to subscribers of this stream, in particular to those who care about axiomatising probability theory in Markov categories (though it is written mainly for a programming language theory audience).

One of our main results is a presentation, using generators and equations, of the Markov category of stochastic maps between objects that are powers of the two-element set, with the cartesian product as monoidal product. We give this presentation using string diagrams: these can be thought of as probabilistic circuits, i.e., Boolean circuits with additional primitives for coin flips (Bernoulli distributions) and explicit conditioning. Conditioning-free circuits correspond to those circuits that can be discarded (we call them causal circuits) and their semantics lands in FinStoch. Another way to formulate what I said above is that causal circuits modulo the equational theory we give in the paper are equivalent to the full monoidal subcategory of (FinStoch,×\times) on powers of the Booleans. This means that any two circuits that denote the same stochastic map can be shown to be equal using only equational reasoning.

Circuits that include conditioning operations may contain unsatisfiable conditions however (this is the usual problem with conditioning on events of measure zero), and thus require a more sophisticated semantics. There are several ways of dealing with this issue in the literature on probabilistic programming: in this paper, we chose to interpret circuits with conditioning in a category of sub-stochastic maps (the distributions involved sum to at most 1), quotiented by the equivalence relation that identifies them whenever they are proportional (this was first proposed as categorical semantics in the work of Dario Stein and Sam Staton afaik). Our second main result is a presentation of this monoidal category, once again on objects that are powers of the Booleans. Interestingly, it is not Markov, since not all its morphisms are causal (conditioning is not, for example), but it is compact-closed. In fact, it is a hypergraph category with copy, delete, conditioning, and the uniform distribution giving the required Frobenius algebra on the Booleans.

view this post on Zulip Cole Comfort (Aug 28 2024 at 14:58):

This is cool!

Do you know if the subcategory of spans of finite sets whose objects are powers of 2 embed within this category of sub-stochastic maps? I am curious, because I am interested in how nondeterminism and probability can be made to interact.

view this post on Zulip Robin Piedeleu (Aug 28 2024 at 15:15):

Hmm... spans of finite sets correspond to matrices whose entries are natural numbers, so perhaps you could map such a matrix AA to (the equivalence class) of the substochastic map/matrix whose columns are the re-normalised columns of AA (and the same column when AA has an all-zeroes column). This will not give you a faithful functor though, if it even gives you a functor at all.

view this post on Zulip Robin Piedeleu (Aug 29 2024 at 05:49):

I'm also not sure why an embedding of spans would count as combining nondeterminism and probability in your opinion. The classic issue with combining these two effects this is that there is no distributive law between the powerset monad and the distribution monad. However, there are alternatives, like the convex powerset monad which can be seen to combine them in a more limited sense. Is that the sort of thing you're looking for? Alternatively, the multiset monad does distribute over the distribution monad, iirc, and the Kleisli category of the multiset monad is equivalent to the category of spans of finite sets (with finite apex as well), so you might want to look into that instead.

view this post on Zulip Robin Piedeleu (Aug 29 2024 at 06:07):

Yet another approach is the one proposed by @Dario Stein and @Richard Samuelson in A Category for unifying Gaussian Probability and Nondeterminism. They define an extended Gaussian over some vector space VV as a Gaussian distribution on a quotient space of V/DV/D, where the subspace DD can be thought of as a "nondeterministic fiber". (And, as you know, we've axiomatised this category and more in Graphical Quadratic Algebra).

The same idea could perhaps be applied in the discrete setting? We could define an extended distribution over some (finite) set XX as a pair of an equivalence relation over XX and a distribution over its equivalence classes.

view this post on Zulip Robin Piedeleu (Aug 29 2024 at 06:31):

But I'm not sure how to extend this idea to define a category of extended stochastic maps. If we follow the idea of the paper above, we should define an extended stochastic map XYX\rightarrow Y as a triple (,f,ϕ)(\sim, f,\phi) of an equivalence relation over YY, a map f:XY/f:X\rightarrow Y/\sim, and a probability distribution over Y/Y/\sim. However, with this definition we do not recover stochastic maps as a special case, since we do not have a distribution for each element of XX. The alternative would be to define extended stochastic maps XYX\to Y as stochastic maps of type XY/X\rightarrow Y/\sim, but I don't know how to compose these.

view this post on Zulip Dario Stein (Aug 29 2024 at 06:47):

Robin Piedeleu said:

Yet another approach is the one proposed by Dario Stein and Richard Samuelson in A Category for unifying Gaussian Probability and Nondeterminism. They define an extended Gaussian over some vector space VV as a Gaussian distribution on a quotient space of V/DV/D, where the subspace DD can be thought of as a "nondeterministic fiber". (And, as you know, we've axiomatised this category and more in Graphical Quadratic Algebra).

The same idea could perhaps be applied in the discrete setting? We could define an extended distribution over some (finite) set XX as a pair of an equivalence relation over XX and a distribution over its equivalence classes.

Hi @Robin Piedeleu , this is indeed an intriguing and natural idea and I've tried to make it work. Here's the story: We can focus on a simpler setup, namely pure nondeterminism without probability in Set. A map XYX \to Y shall be a pair (f,)(f,\sim) where f:XY/f : X \to Y/{\sim}. I called this a "copartial map" because a map into a quotient is somehow dual to a partial map (a map out of a subobject). This forms a category (composition by pushout), however it fails to be monoidal (or Markov) in the way you want it to be. If your monoidal structure is coproduct ++ then everything works fine (like in general categories of decorated cospans). However, to compare this with probability theory, you really want the cartesian product ×\times, because that's how you talk about joint distributions, and it's what FinStoch formalizes. It turns out ×\times does not give a valid monoidal structure on copartial maps, essentially because the relevant class of pushouts (one leg epi) does not commute with products. It does in vector spaces, which is what makes extended Gaussians work, but vector spaces have biproducts so you can't tell the difference between ++ and ×\times.

view this post on Zulip davidad (David Dalrymple) (Aug 29 2024 at 10:43):

My guess is that the semantic issue here (about “a distribution over a partition” not being a kind of epistemic state that is compatible with ×) can be illustrated as follows. Suppose I have the state of knowledge ((x≤1, x>1), (0.6, 0.4)) about x, and the state of knowledge ((y≤2, y>2), (0.7, 0.3)) about y. Then, my state of knowledge about (x,y) can not be expressed as ((x≤1∧y≤2, x≤1∧y>2, x>1∧y≤2, x>1∧y>2), (p₀₀,p₀₁,p₁₀,p₁₁)), because I don’t know whether x and y are independent, or how they might be dependent.

view this post on Zulip davidad (David Dalrymple) (Aug 29 2024 at 10:48):

My current guess of the most promising “epistemic state” is

view this post on Zulip Cole Comfort (Aug 29 2024 at 11:28):

Dario Stein said:

Robin Piedeleu said:

Yet another approach is the one proposed by Dario Stein and Richard Samuelson in A Category for unifying Gaussian Probability and Nondeterminism. They define an extended Gaussian over some vector space VV as a Gaussian distribution on a quotient space of V/DV/D, where the subspace DD can be thought of as a "nondeterministic fiber". (And, as you know, we've axiomatised this category and more in Graphical Quadratic Algebra).

The same idea could perhaps be applied in the discrete setting? We could define an extended distribution over some (finite) set XX as a pair of an equivalence relation over XX and a distribution over its equivalence classes.

Hi Robin Piedeleu , this is indeed an intriguing and natural idea and I've tried to make it work. Here's the story: We can focus on a simpler setup, namely pure nondeterminism without probability in Set. A map XYX \to Y shall be a pair (f,)(f,\sim) where f:XY/f : X \to Y/{\sim}. I called this a "copartial map" because a map into a quotient is somehow dual to a partial map (a map out of a subobject). This forms a category (composition by pushout), however it fails to be monoidal (or Markov) in the way you want it to be. If your monoidal structure is coproduct ++ then everything works fine (like in general categories of decorated cospans). However, to compare this with probability theory, you really want the cartesian product ×\times, because that's how you talk about joint distributions, and it's what FinStoch formalizes. It turns out ×\times does not give a valid monoidal structure on copartial maps, essentially because the relevant class of pushouts (one leg epi) does not commute with products. It does in vector spaces, which is what makes extended Gaussians work, but vector spaces have biproducts so you can't tell the difference between ++ and ×\times.

What kind of quotient is this that gives you such a copartial map?

view this post on Zulip Robin Piedeleu (Aug 29 2024 at 11:30):

I think Dario just means a quotient given by an equivalence relation over YY. And the equivalence relation is part of the data of such a map.

view this post on Zulip Robin Piedeleu (Aug 29 2024 at 11:32):

davidad (David Dalrymple) said:

My current guess of the most promising “epistemic state” is

This looks interesting, but replacing partitions by [0,∞]-valued functions kind of defeats the purpose of combining nondeterminism with probability, doesn't it? At least, the idea of "nondeterminism" imo is that you have some information that you do not know how to quantify precisely. Maybe I misunderstood you point, though

view this post on Zulip davidad (David Dalrymple) (Aug 29 2024 at 11:49):

At least in the finitely generated case, any “imprecise probability distribution” (in the sense of https://arxiv.org/abs/2405.09391, i.e. a convex subset of probability space) is immediately representable in terms of upper bounds on expectations of [0,∞]-valued functions, by Minkowski-Weyl. That is, the purpose of this list of functions is to define the faces of the convex set. The imprecision comes from the fact that we do not know the expected values of every function (as Riesz-Markov would give us if we had a precise probability measure).

view this post on Zulip Robin Piedeleu (Aug 29 2024 at 11:53):

I see, thanks. This is more in line with the approach that uses the convex powerset monad I mentioned earlier, than trying to port what Dario and Richard did for Gaussians to the discrete case then.

view this post on Zulip Cole Comfort (Aug 29 2024 at 11:57):

Robin Piedeleu said:

Yet another approach is the one proposed by Dario Stein and Richard Samuelson in A Category for unifying Gaussian Probability and Nondeterminism. They define an extended Gaussian over some vector space VV as a Gaussian distribution on a quotient space of V/DV/D, where the subspace DD can be thought of as a "nondeterministic fiber". (And, as you know, we've axiomatised this category and more in Graphical Quadratic Algebra).

The same idea could perhaps be applied in the discrete setting? We could define an extended distribution over some (finite) set XX as a pair of an equivalence relation over XX and a distribution over its equivalence classes.

For Gaussian relations, the equivalence relation is given by the quotient of a vector space by an affine subspace. And this means that you don't have to worry about this problem involving the product vs the coproduct. So maybe it would be easier to drop the and gate and first try to generalize this to "discrete" vector spaces rather than merely finite sets; ie. vector spaces over finite fields.

What is really nice about Gaussian relations is that they are generated under relational composition (if represented in the right way) by the unital variance, centered normal distribution on the real line in addition to all the affine relations over the real numbers. What is perhaps even nicer, is that one obtain Gaussian relations from affine relations via a universal construction, by freely discarding the rotations (essentially).

Therefore it seems natural to ask what happens when one freely discards the "rotations" (if it makes sense to talk about SO(n,Fq){\sf SO}(n,\mathbb{F}_q) ) in the category of affine relations over finite fields... or really over general fields. However, I have no idea if this uniquely determines some discrete probability distribution.

view this post on Zulip davidad (David Dalrymple) (Aug 29 2024 at 12:01):

davidad (David Dalrymple) said:

At least in the finitely generated case, any “imprecise probability distribution” (in the sense of https://arxiv.org/abs/2405.09391, i.e. a convex subset of probability space) is immediately representable in terms of upper bounds on expectations…

I’m not confident about how well this idea generalizes to infinite state spaces, and this is something I’ll be looking to encourage researchers in my ARIA programme to look into. Some particular directions of relatively well-behaved but more general-enough spaces include the Perrone-Fritz setting of tau-smooth probability measures on Top, Huot-Staton on 𝜔PAP, and whatever Goubault-Larrecq is up to here: https://dl.acm.org/doi/abs/10.1145/3611660