Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: learning: questions

Topic: Category-theoretic probability

David Corfield (May 29 2025 at 10:00):

I find myself taking a dive into the world of [[category-theoretic approaches to probability theory]] and it's raising some questions for me. That nLab page, predominantly written by @Paolo Perrone, talks of three main structures of interest: Markov categories, Probability monads, and Dagger categories.

Taking the second of these, I see that there's a desire to have some nicer category than $Meas$, so that one might look at, say, the quasi-topos of [[quasi-Borel spaces]]. This would be for probabilistic programming purposes, cartesian closure obviously being a good thing.

But presumably we shouldn't see this line of research as competing with the Markov category approach, since at some point one will want to use the Kleisli category of some probability monad, and, as we hear at [[monads of probability, measures, and valuations]]:

Kleisli categories of probability monads are often instances of Markov categories.

This is the case for the probability monad on $Qbs$.
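To make the quoted statement concrete in the simplest case, here is a minimal sketch of the Kleisli category of the finite distribution monad on sets. It assumes distributions are represented as Python dicts from outcomes to exact probabilities; the names `unit`, `bind`, `kleisli`, `copy`, `discard` and `flip` are illustrative choices, not notation from the nLab page. It only shows composition of kernels and the copy/discard maps, and is not a full verification of the Markov category axioms.

```python
from fractions import Fraction


def unit(x):
    """Dirac distribution at x (the unit of the monad)."""
    return {x: Fraction(1)}


def bind(dist, kernel):
    """Average the distributions kernel(x) with weights dist[x]."""
    out = {}
    for x, p in dist.items():
        for y, q in kernel(x).items():
            out[y] = out.get(y, Fraction(0)) + p * q
    return out


def kleisli(f, g):
    """Kleisli composite of two kernels: first f, then g."""
    return lambda x: bind(f(x), g)


def copy(x):
    """Deterministic copy map X -> X x X, viewed as a kernel."""
    return unit((x, x))


def discard(x):
    """Deterministic map to the one-point set, viewed as a kernel."""
    return unit(())


def flip(b):
    """Hypothetical noisy channel on bits: keep b with probability 3/4."""
    return {b: Fraction(3, 4), 1 - b: Fraction(1, 4)}


two_flips = kleisli(flip, flip)
print(two_flips(0))
print(kleisli(flip, discard)(0) == discard(0))
```

The last line checks, for this one kernel, that discarding the output of a kernel is the same as discarding its input, which is the condition separating Markov categories from more general copy-discard categories.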

Might we say then that the best of both worlds comes in situations where a topos-like category of measurable spaces with a probability monad generates a Markov category through the Kleisli construction?

Sometimes Kleisli categories inherit some structure, such as here, where there are even examples of some being themselves toposes. Do the Kleisli categories of probability monads typically have some useful extra structure, beyond being Markov, derived from the structure of the original category?

Paolo Perrone (May 29 2025 at 10:15):

Yes, that's more or less what we want.
Kleisli categories of probability monads tend moreover to be representable Markov categories, which are particularly nice. (And they can be even nicer, such as observationally representable.)

David Corfield (May 29 2025 at 10:52):

Thanks, Paolo! And I see the first of these is on the nLab at [[Representable Markov categories]].

Could you say briefly what observational representability adds?

Tobias Fritz (May 29 2025 at 10:54):

I think it's worth adding that having a "nice" category equipped with a probability monad (whatever this means precisely) is not sufficient for the purposes of probability theory, and it can in fact fall far short of it. Let me illustrate this with two examples, where the first one is rather obvious and the second may be quite surprising:

(1) Set is a category that is as nice as it gets, and presumably the distribution monad counts as a probability monad. The resulting Markov category is good for the purposes of discrete probability theory, but obviously doesn't capture any measure-theoretic probability. Categorically, this manifests itself in the lack of [[Kolmogorov products]] for the Markov category. Since Kolmogorov products are necessary to talk about joint distributions of infinitely many random variables -- a core theme in probability -- we're lacking a core feature, and this happens although the category we started with was extremely nice.
(2) Quasi-Borel spaces with their probability monad do not conform with the desiderata of probability theory either. As we've shown in Prop 3.3 of Dilations and information flow axioms in categorical probability, the Kleisli category fails to be a positive Markov category. In a positive Markov category, every distribution on a product space $X \times Y$ which has a deterministic marginal necessarily makes the two variables independent. This is a basic feature enjoyed by ordinary probability in both the discrete and measure-theoretic incarnations, but it fails for distributions on quasi-Borel spaces. I believe that this rules them out as a convenient category for probability.
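The consequence of positivity described above can be checked by brute force in the discrete case. The sketch below assumes finite joint distributions stored as dicts keyed by pairs, and it tests only the informal statement given here (a deterministic marginal forces independence), not the general definition of positivity from the paper. The helper names `marginal`, `is_deterministic`, `is_independent` and `joints_on_grid` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def marginal(joint, axis):
    """Marginal of a joint distribution on pairs (axis 0 is X, axis 1 is Y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], Fraction(0)) + p
    return out


def is_deterministic(dist):
    """A finite distribution is deterministic when one outcome has mass 1."""
    return any(p == 1 for p in dist.values())


def is_independent(joint):
    """True when the joint equals the product of its two marginals."""
    mx, my = marginal(joint, 0), marginal(joint, 1)
    return all(
        joint.get((x, y), Fraction(0)) == mx[x] * my[y]
        for x, y in product(mx, my)
    )


def joints_on_grid(xs, ys, denom):
    """All joints on xs x ys whose weights are multiples of 1/denom."""
    cells = list(product(xs, ys))
    for weights in product(range(denom + 1), repeat=len(cells)):
        if sum(weights) == denom:
            yield {c: Fraction(w, denom) for c, w in zip(cells, weights)}


# Every joint on a 2 x 2 grid with a deterministic marginal is a product.
positivity_holds = all(
    is_independent(j)
    for j in joints_on_grid("ab", (0, 1), 4)
    if is_deterministic(marginal(j, 0)) or is_deterministic(marginal(j, 1))
)

# A correlated joint: neither marginal is deterministic, so no constraint applies.
correlated = {("a", 0): Fraction(1, 2), ("b", 1): Fraction(1, 2)}
print(positivity_holds, is_independent(correlated))
```

This only exercises finitely supported distributions on a small grid; it says nothing about the quasi-Borel counterexample, which relies on features that a finite computation cannot capture.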

Tobias Fritz (May 29 2025 at 11:33):

In other words, there are many additional conditions that one can impose on a Markov category in order for it to support some of the basic theorems of probability, as they have been proven in entirely categorical terms based on these conditions. And these conditions are quite unrelated to whether the subcategory of deterministic morphisms is cartesian closed or a topos.

It may also be interesting to note that these conditions are really just properties, such as representability; no extra structure is needed for any of the existing results (that I can think of right now).

Tobias Fritz (May 29 2025 at 11:36):

I don't want to deny that cartesian closure (of the category of deterministic morphisms) is important for probabilistic programming. But I do want to argue that other properties are more important than that still, because without these you don't even get anything that resembles probability!

David Corfield (May 29 2025 at 11:38):

Thanks! And nLab has [[positivity]] too (no doubt thanks to Paolo).

If $Qbs$ is lacking, as you explained, is there a leading candidate for a good subcategory of deterministic morphisms? One which would allow its probabilistic Markov category to have the good features mentioned.

David Corfield (May 29 2025 at 11:47):

I should read further! There's a nice table at [[Markov category -- Detailed list]] where it seems that $BorelStoch$ is doing well. But then presumably $BorelMeas$ isn't so nice.

Tobias Fritz (May 29 2025 at 11:52):

There aren't any leading candidates that I'm aware of, but this doesn't mean much -- Paolo, @Sam Staton and @Sean Moss will know the state of the art on this problem better.

Tobias Fritz (May 29 2025 at 11:52):

Right, BorelStoch is our "default" Markov category for measure-theoretic probability, for essentially the same reasons as why probability theorists typically work with standard Borel spaces. That seems to work pretty well in the sense that essentially all measurable spaces that tend to occur in probability research (on both the pure and applied side) are standard Borel. But the deterministic subcategory BorelMeas is indeed not that nice: it only has countable products but not uncountable ones, and it's not cartesian closed. Relatedly, BorelStoch only has countable Kolmogorov products but not uncountable ones.

Paolo Perrone (May 29 2025 at 13:34):

About nice categories of deterministic morphisms: as proven in proposition 5.8 here, under reasonable assumptions, cartesian closed categories which have "something like the real line" tend not to be positive. So I feel there's a bit of a compromise necessary.

Paolo Perrone (May 29 2025 at 13:39):

About observational representability: it's true, there is not much on the nLab, and it's becoming such a standard notion that we should probably write something about it. (A stub is here, but more needs to be added.)
Roughly, it says that we can distinguish two probability distributions by sampling them independently many times, which is what people do (ideally) in statistics, science, etc.
Formally, this says that given the sampling map $\mathit{samp}:PX\to X$ (the counit of the Kleisli adjunction), the composite maps

$PX \xrightarrow{\mathit{copy}} (PX)^n \xrightarrow{\mathit{samp}^n} X^n$

for all $n$ form a jointly monic family.
For now it's explained in section 6 here and in appendix A.4 here.
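A small numerical illustration of why the whole family over all $n$ is needed, assuming the discrete setting where a "random distribution" is a finite mixture of coin biases. The function `n_sample_law` and the two example mixtures are hypothetical names introduced here: the function computes the law of $n$ independent flips of a coin whose bias is first drawn from the mixture, which is what composing with $\mathit{copy}$ and $\mathit{samp}^n$ amounts to in this case. The example shows two different mixtures that agree for $n = 1$ but are separated by $n = 2$; it does not prove joint monicity.

```python
from fractions import Fraction
from itertools import product


def n_sample_law(mixture, n):
    """Law of n i.i.d. flips of a coin whose bias is drawn from `mixture`.

    `mixture` maps a bias (probability of outcome 1) to its weight.
    """
    law = {}
    for outcome in product((0, 1), repeat=n):
        total = Fraction(0)
        for bias, weight in mixture.items():
            p = weight
            for bit in outcome:
                p *= bias if bit == 1 else 1 - bias
            total += p
        law[outcome] = total
    return law


# Always a fair coin.
fair = {Fraction(1, 2): Fraction(1)}
# A two-headed or a two-tailed coin, each with probability 1/2.
extreme = {Fraction(0): Fraction(1, 2), Fraction(1): Fraction(1, 2)}

print(n_sample_law(fair, 1) == n_sample_law(extreme, 1))
print(n_sample_law(fair, 2) == n_sample_law(extreme, 2))
```

With one sample both mixtures look like a fair coin, but with two samples the second mixture never produces disagreeing outcomes, so the two laws differ.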

David Corfield (May 29 2025 at 14:07):

Paolo Perrone said:

So I feel there's a bit of a compromise necessary.

I was getting that sense.

Antonio Lorenzin (May 29 2025 at 15:13):

Tobias Fritz said:

But the deterministic subcategory BorelMeas is indeed not that nice: it only has countable products but not uncountable ones, and it's not cartesian closed. Relatedly, BorelStoch only has countable Kolmogorov products but not uncountable ones.

Just to point it out, this should be solved by taking Baire measurable spaces (i.e. measurable spaces whose sigma-algebra is the Baire sigma-algebra induced by some compact Hausdorff topology). At the moment, we need to dive more deeply into this setting if we want to ensure this is a well-behaved extension of BorelMeas out of the separability condition.

Antonio Lorenzin (May 29 2025 at 15:16):

(btw this should solve the problem of products, but not that of cartesian closure!)

Madeleine Birchfield (May 29 2025 at 16:50):

Alex Simpson has three toposes in which to do probability theory in:

https://www.youtube.com/watch?v=Y1RkPhwJ0Mo

Paolo Perrone (May 29 2025 at 16:56):

Yes! But to be precise, Simpson is not exactly forming a topos of (say) measurable spaces and measurable maps, or the "deterministic morphisms" of a Markov category. He is defining a topos of sets or spaces of random variables. These are beautifully modeled as particular sheaves on a category of probability spaces (or on a similar site).
So it's a different topos altogether.
(There are tight links between Simpson's approach, the dagger-category approach, and Markov categories. See for example this recent work by Stein.)

David Corfield (May 29 2025 at 18:19):

Ah, this could be useful for me:

image.png

Tobias's A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics

Tobias Fritz (May 29 2025 at 18:23):

I don't remember if I was aware of that at the time of writing the paper, but ideas like that (with stochastic processes in mind) already go back to Lawvere's Category of Probabilistic Mappings manuscript! (See the final few pages)

Tobias Fritz (May 29 2025 at 18:25):

Would you possibly have use for this in your work on the Safeguarded AI programme?

David Corfield (May 29 2025 at 18:32):

I've been working a bit with Vineet Rajani and Dominic Orchard, and it seems that the former's approach to cost analysis of programs is modelled well by presheaves over an ordered monoid, $[M, Set]$, with $M$ thin monoidal. We were looking for a model for the version for probabilistic programming, so I guess we could think about your construction which would allow a Markov category structure on these presheaves.

Tobias Fritz (May 29 2025 at 18:48):

Yep, that should work well as long as your presheaves take values in deterministic morphisms (while the morphisms of presheaves can have non-deterministic components). This determinism is necessary in order to get a Markov category again. This is related to why probability theorists tend to model stochastic processes in terms of filtrations of σ-algebras: the identity map is a measurable deterministic morphism from a measurable space with a large σ-algebra into the same space with a smaller σ-algebra.

David Corfield (May 29 2025 at 18:50):

Tobias Fritz said:

...as long as your presheaves take values in deterministic morphisms...

Yes, that's it.

David Corfield (May 30 2025 at 08:52):

A couple of questions this morning:

(1) Do the diagrammatic Markov categories just mentioned always inherit the good qualities of the codomain Markov category (having conditionals, being representable, Kolmogorov products, etc.)?

(2) I see from the table of probability monads that in many cases the full Eilenberg–Moore category isn't known. Is there any sense that non-free algebras could have some importance for category-theoretic probability theory?

Paolo Perrone (May 30 2025 at 08:53):

For question 2: yes, and it has to do with expectation values. We're working on a paper about it, and it should be out soon. [I know I've been saying that for months, sorry! :sweat_smile:]

Morgan Rogers (he/him) (May 30 2025 at 10:05):

Madeleine Birchfield said:

Alex Simpson has three toposes in which to do probability theory in:

https://www.youtube.com/watch?v=Y1RkPhwJ0Mo

I missed this talk when it happened, I'm glad to have been reintroduced to it. I often complain that "applications" of topos theory rarely get further than constructing a single topos (or quasi-topos) to work inside and then leveraging the properties of that topos. There is a bigger picture to be found, and a closer examination of ways of comparing different categories used in categorical probability theory could connect things together much more firmly than the typical 'such and such used a similar but subtly different approach' that one sees in computer science papers all the time.

Tobias Fritz (May 30 2025 at 10:08):

As for (1), I think that the inheritance of representability and Kolmogorov products is fairly easy to see by doing it objectwise, although AFAIK it hasn't been written up. Conditionals are much harder though, and I believe that it's still open whether conditionals are inherited e.g. by an arrow category. Since there is no obvious construction of conditionals in an arrow category from conditionals in the original category, it seems likely that they won't be inherited in general. But it might well be the case in specific categories like the arrow category of BorelStoch.

Tobias Fritz (May 30 2025 at 10:13):

Concerning non-free Eilenberg-Moore algebras, they're surprisingly irrelevant in the sense that many more algebras than one would naively expect turn out to be free. For example, it's quite obvious that idempotents split in the EM category of a (reasonably nice) probability monad. What's nontrivial, and what fails for many other kinds of monads, is that they even split in the Kleisli category! In other words, the splitting in the EM category is through a free algebra. We've shown this for BorelStoch here.

Another instance is limits of certain diagrams, like equalizers of group actions. Also in this case the limit quite obviously exists in the EM category, but the fact that it's a free algebra is much less obvious. This is one way to state the ergodic decomposition theorem! (See A category-theoretic proof of the ergodic decomposition theorem for a slightly weaker version modulo almost sure equality.)

David Corfield (Jun 19 2025 at 08:06):

Returning to the diagram category idea from above, I guess one easy way to construct these is the situation where $C$ arises as the Kleisli category for some monad $M$ on $C_{det}$, and then the diagram category is the Kleisli category for $M$ extended to a monad on $[D, C_{det}]$.

Since in the case I mentioned we are dealing with $[P, Set]$, where $P$ is an ordered monoid (capturing cost), there's heaps of structure about, such as being a doubly closed monoidal category. Has anyone looked to combine such structure with the Markov structure in the diagram category? It seems one can represent expected costs in probabilistic programs.

Tobias Fritz (Jun 19 2025 at 11:37):

In what sense is [P, Set] doubly closed? Are you looking at two different monoidal structures? If so, which ones?

David Corfield (Jun 19 2025 at 12:58):

Say, $P$ is an ordered monoid considered as a thin monoidal category, then $[P, Set]$ has Day convolution and the cartesian product as the two monoidal products.
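
For reference, a sketch of the Day convolution in this setting, under the assumption that $P = (P, \le, \cdot, e)$ is regarded as a thin monoidal category with a unique arrow $a \to b$ exactly when $a \le b$. This convention and the symbols $F$, $G$, $\otimes_{Day}$ are assumptions, not fixed in the thread:

$(F \otimes_{Day} G)(p) = \int^{a, b \in P} P(a \cdot b, p) \times F(a) \times G(b)$

Since $P$ is thin, $P(a \cdot b, p)$ is a singleton when $a \cdot b \le p$ and empty otherwise, so the coend only glues together pairs in $F(a) \times G(b)$ with $a \cdot b \le p$. The unit for this product would be the representable $P(e, -)$, while the cartesian product is computed pointwise, $(F \times G)(p) = F(p) \times G(p)$.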

Pedro Amorim (Jun 20 2025 at 00:05):

I have done some work on denotational semantics for expected cost analysis: https://dl.acm.org/doi/10.1145/3720424

I haven't connected it to the Markov category literature, but what I can say is that the expected cost monad I have defined in Section 3.3 of the linked paper is commutative (but not affine) in certain examples, so its Kleisli category will be a CD category.
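
To illustrate what "commutative but not affine" means in a toy case, here is a sketch using the plain writer monad for additive integer costs. This is an assumption made for illustration only and is not the expected cost monad from the linked paper. The names `unit`, `bind`, `kleisli`, `discard`, `tick` and `pair` are hypothetical.

```python
def unit(x):
    """Return x at zero cost."""
    return (0, x)


def bind(m, k):
    """Run k on the value of m and add up the costs."""
    cost, x = m
    extra, y = k(x)
    return (cost + extra, y)


def kleisli(f, g):
    """Kleisli composite: first f, then g."""
    return lambda x: bind(f(x), g)


def discard(x):
    """Forget the value, at zero cost."""
    return unit(())


def tick(x):
    """A computation that returns x and costs one unit."""
    return (1, x)


def pair(m1, m2, left_first=True):
    """Combine two computations into a pair, in either evaluation order."""
    if left_first:
        return bind(m1, lambda a: bind(m2, lambda b: unit((a, b))))
    return bind(m2, lambda b: bind(m1, lambda a: unit((a, b))))


# Commutative: the evaluation order does not matter because + is commutative.
print(pair((2, "a"), (3, "b")) == pair((2, "a"), (3, "b"), left_first=False))

# Not affine: discarding the result of `tick` still remembers the cost.
print(kleisli(tick, discard)(5), discard(5))
```

The second check is the failure of naturality of discarding, which is why such a Kleisli category is a copy-discard category rather than a Markov category.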

I don't know if this relates to your functorial approach, but I hope that it piques your interest :)

David Corfield (Jun 20 2025 at 06:58):

Thanks, Pedro. I've taken note of your interesting paper.

This copresheaf model of ours was arrived at with Vineet Rajani and others at Kent by looking to make category-theoretic sense of

A unifying type-theory for higher-order (amortized) cost analysis [here | technical appendix]
With Marco Gaboardi, Deepak Garg and Jan Hoffmann
In Proceedings of the ACM on Programming Languages (POPL), 2021

and the probabilistic version

A modal type theory of expected cost in higher-order probabilistic programs [preprint |here]
With Gilles Barthe and Deepak Garg.

Without any CT being used in the design, it seems very natural to interpret things in terms of families of copresheaves over an ordered cost monoid. Something on this should appear soon.

Pedro Amorim (Jun 20 2025 at 17:20):

Something else that might be relevant to this copresheaf model of yours is this note by @Paolo Perrone and @Tobias Fritz

https://arxiv.org/abs/1809.10481

They show that under certain circumstances you can Kan extend a graded monad into a monad. In particular, you can recover my expected cost monad from a graded cost monad similar to the one Vineet et al. have used in their probabilistic cost paper above.

I really think that this note should be better known as this Kan extension construction is very clever and insightful!