Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: event: Categorical Probability and Statistics 2020 workshop

Topic: Tutorial: What is a probability monad? (Paolo Perrone)


view this post on Zulip Paolo Perrone (Jun 01 2020 at 19:30):

Hello all. Here's the first tutorial video, an elementary introduction to probability monads.
https://youtu.be/3Da88Tgw_rM
Feel free to ask any questions here :)
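For anyone who wants to play with the simplest example from the tutorial, here is a minimal sketch of the finite distribution monad (my own toy encoding, not code from the talk), with distributions as dicts from outcomes to probabilities:

```python
# A minimal sketch of the finite distribution monad (my own toy encoding,
# not code from the tutorial): distributions are dicts from outcomes to
# probabilities.

def unit(x):
    """Unit of the monad: the Dirac distribution concentrated at x."""
    return {x: 1.0}

def bind(dist, kernel):
    """Kleisli extension: push a distribution through a stochastic map
    (a function from outcomes to distributions) and flatten the result."""
    result = {}
    for x, p in dist.items():
        for y, q in kernel(x).items():
            result[y] = result.get(y, 0.0) + p * q
    return result

# A fair coin, then a kernel that re-flips heads with bias 3/4:
coin = {"H": 0.5, "T": 0.5}
step = lambda x: {"H": 0.75, "T": 0.25} if x == "H" else unit("T")
print(bind(coin, step))  # {'H': 0.375, 'T': 0.625}
```

Here `bind` is exactly averaging a Markov kernel against a distribution, so the Kleisli category of this monad has stochastic matrices as morphisms.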

view this post on Zulip Tomáš Gonda (Jun 02 2020 at 09:21):

Thanks for the tutorial, Paolo, that was useful!

view this post on Zulip Paolo Perrone (Jun 02 2020 at 12:52):

Great to hear! :)

view this post on Zulip Evan Patterson (Jun 04 2020 at 04:38):

Thanks for this nice tutorial, Paolo. Super clear.

view this post on Zulip Paolo Perrone (Jun 04 2020 at 12:48):

Thank you!

view this post on Zulip Paolo Perrone (Jun 04 2020 at 19:34):

Arthur Parzygnat has created this nice video series, if some of you want to learn more about probability monads:
https://www.youtube.com/playlist?list=PLSx1kJDjrLRSKKHj4zetTZ45pVnGCRN80

view this post on Zulip Arthur Parzygnat (Jun 04 2020 at 21:02):

I really enjoyed your explanation of the example of the power set monad and how its Kleisli category is the category of relations. The explanation in your thesis was also very clear and I highly recommend that others check it out. I have a question about your last example on convolution (btw, I found Tao's blog https://terrytao.wordpress.com/2013/07/26/computing-convolutions-of-measures/ useful while thinking about this example), although it's very open-ended: what other constructions can you obtain by using $\nabla$ besides convolution of probability measures? If you take an internal monoid or an internal monoid action, it seems you get a slight generalization of the notion of convolution on $\mathbb{R}$.
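The power set example can be checked concretely: encoding a morphism $A \to PB$ as a set-valued function, Kleisli composition is ordinary composition of relations. A small illustrative sketch (my own encoding):

```python
# Set-valued functions as morphisms A -> P(B): Kleisli composition for the
# power set monad is ordinary composition of relations (a small
# illustrative sketch, my own encoding).

def kleisli_compose(f, g):
    """(g after f)(a) = union of g(b) over all b in f(a)."""
    return lambda a: {c for b in f(a) for c in g(b)}

# Two toy relations on the integers:
f = lambda a: {a, a + 1}   # a is related to a and a + 1
g = lambda b: {2 * b}      # b is related to 2b (a function, seen as a relation)
h = kleisli_compose(f, g)
print(sorted(h(3)))  # [6, 8]
```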

view this post on Zulip Paolo Perrone (Jun 04 2020 at 21:21):

Arthur Parzygnat said:

I really enjoyed your explanation of the example of the power set monad and how its Kleisli category is the category of relations. The explanation in your thesis was also very clear and I highly recommend that others check it out. I have a question about your last example on convolution (btw, I found Tao's blog https://terrytao.wordpress.com/2013/07/26/computing-convolutions-of-measures/ useful while thinking about this example), although it's very open-ended: what other constructions can you obtain by using $\nabla$ besides convolution of probability measures? If you take an internal monoid or an internal monoid action, it seems you get a slight generalization of the notion of convolution on $\mathbb{R}$.

Thank you! So, indeed, a lax monoidal functor maps internal monoids to internal monoids - this generalizes what happens in $\mathbb{R}$. Moreover, the fact that we get a monoidal monad, i.e. that the multiplication of the monoidal structure commutes with the monad structure maps, makes this monoid internal to the category of algebras over the monad. In other words, the convolution is "convex-bilinear" in our case.
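As a concrete finite instance of this convolution: form the product of two distributions and push it forward along the monoid operation, here addition on the integers. The sketch below (my own, illustrative) computes the distribution of the sum of two dice:

```python
# Convolution of finite probability measures, concretely: form the product
# distribution and push it forward along the monoid operation (here
# addition on the integers). An illustrative sketch, my own encoding.
from itertools import product

def convolve(p, q, op=lambda a, b: a + b):
    """Pushforward of the product measure p (x) q along op."""
    out = {}
    for (a, pa), (b, qb) in product(p.items(), q.items()):
        s = op(a, b)
        out[s] = out.get(s, 0.0) + pa * qb
    return out

die = {i: 1 / 6 for i in range(1, 7)}
two_dice = convolve(die, die)
print(two_dice[7])  # ≈ 0.1667, i.e. 6/36
```

Passing a different `op` gives the generalized convolutions for other internal monoids that Arthur asks about.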

view this post on Zulip Paolo Perrone (Jun 04 2020 at 21:23):

The tensor product in question, which takes care of this form of "bilinearity", is this one: https://ncatlab.org/nlab/show/tensor+product+of+algebras+over+a+commutative+monad

view this post on Zulip Paolo Perrone (Jun 04 2020 at 21:24):

I'm not aware of whether other algebraic structures (with nontrivial identities) are preserved besides monoids in general. Groups are not, for example.

view this post on Zulip Paolo Perrone (Jun 04 2020 at 21:43):

A slight generalization of monoids, namely graded monoids, where the grading is given by any (other) monoid or monoidal category, is also preserved, since those can be written as lax monoidal functors themselves.

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 08:08):

Hey Paolo - I have a simple question. On the nlab page for probability monads it says: "Note that “probability monad” and “measure monad” are not used here as technical terms with a definition. Here we just explain what most monads of this kind looks like. (There are ongoing works which may give a general, precise definition of probability monad.)"

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 08:10):

So my question is: has there been any progress in giving a precise definition of a "probability monad"? I recall you gave a discussion about this at a talk of yours in Canada a few years ago (I recall you referencing bimonoidal monads).

view this post on Zulip Arthur Parzygnat (Jun 05 2020 at 10:01):

As far as I am aware, we don't yet have a definition that reproduces all of these (plus others that Paolo did not mention in his talk). Nevertheless, I'm also curious about the specific progress. Maybe @Tobias Fritz might also have some ideas since he and collaborators are uncovering many interesting theorems from statistics using particular monads (also @Sam Tenka). I wonder if certain theorems about statistics can be used in reverse to classify the probabilistic monads and distinguish them from others?

view this post on Zulip Paolo Perrone (Jun 05 2020 at 10:51):

Yes to both: there is ongoing joint work of Tobias, Eigil Rischel, Tomáš Gonda and myself, where we give the link between probability monads and Markov categories, and a definition of probability monad comes alongside. We are still working on it though.

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 13:20):

Here is a somewhat more naive question: are there any interesting probability COmonads?

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 13:21):

Paolo Perrone said:

Yes to both: there is ongoing joint work of Tobias, Eigil Rischel, Tomáš Gonda and myself, where we give the link between probability monads and Markov categories, and a definition of probability monad comes alongside. We are still working on it though.

Cool - I'll keep an eye out for this definition.

view this post on Zulip Paolo Perrone (Jun 05 2020 at 14:10):

JS Pacaud Lemay said:

Here is a somewhat more naive question: are there any interesting probability COmonads?

Sure, take the opposite category of Meas... ;p
No, seriously, there are comonads of interest in probability theory. As far as I know the usual probability monads don't admit a comonad structure as well. However, on the other side of the Eilenberg-Moore and Kleisli adjunctions, one does get interesting comonads. For Eilenberg-Moore, one gets a comonad on "convex spaces" which maps a convex space to a "free presentation" of it, which in the case of probability monads gives all the probability measures that have a certain expectation value (one can view them as "instructions to get to that point", with the whole process given by the bar construction).
For the Kleisli case, I'm working on it, but I still don't understand very well what's going on, so I'll stop before I say something stupid.
In addition, there are other comonads on the base category which are of use in probability, most notably the stream comonad. Its coalgebras are dynamical systems, and one can combine it with a probability monad to talk about stochastic processes.
(Do you have a particular construction in mind?)
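The stream comonad mentioned above can be sketched in a few lines (my own encoding, streams as functions on the naturals): the counit takes the head of a stream, and the comultiplication sends a stream to the stream of all its tails.

```python
# A sketch of the stream comonad (my own encoding): streams as functions
# from the naturals; the counit takes the head, and the comultiplication
# (duplicate) sends a stream to the stream of all its tails.

def counit(s):
    """Head of the stream."""
    return s(0)

def duplicate(s):
    """The stream whose n-th entry is the n-th tail of s."""
    return lambda n: (lambda m: s(n + m))

nats = lambda n: n          # the stream 0, 1, 2, ...
tails = duplicate(nats)
print(counit(tails(5)))  # 5: the head of the 5th tail
```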

view this post on Zulip Tobias Fritz (Jun 05 2020 at 14:16):

That's a great question, and it may well be one of the guiding questions to keep in mind over the next few days. Let me share some more detailed thoughts, and everyone please feel free to share your perspective as well or comment on what has already been said.

It's good to keep in mind what we would like a definition of probability monad to do for us. My perspective on this is that we'd like to do synthetic probability theory with it, as Sam already mentioned briefly. This means that we'd like to state and prove theorems of probability and statistics in purely categorical terms. One important benefit is that such theorems then automatically become vastly more general than in their traditional formulation; for example, we can hope to instantiate them on random graphs as in Sam's talk. So this is one thing a definition of probability monad may do for us.

With this in mind, it may well be the case that there is no single notion of probability monad. Stating and proving different theorems may require different categorical axioms of various degrees of strength.

That being said, for probability theory one will at least want to have a commutative monad (or equivalently a symmetric monoidal monad). It's also useful to have the monad commute with filtered limits, or perhaps only with countable filtered limits, for the purposes of implementing the Kolmogorov extension theorem: in probability theory we often want to talk about infinitely many random variables at the same time, and thus one needs to consider distributions on infinite products.

There are other candidate axioms that seem to be useful, such as the possibility to do conditioning. I know how to formulate these in terms of the Kleisli category of the monad, but as far as I'm aware they haven't been translated into axioms on the monad yet. I can expand on this if anyone is interested, but for now this comment is already pretty long.

view this post on Zulip Peter Arndt (Jun 05 2020 at 14:34):

I am _very_ interested - expand all you can, please, when you find the time!

view this post on Zulip Tobias Fritz (Jun 05 2020 at 14:52):

To talk about the existence of conditionals, it's convenient to introduce a bit of notation. The following is from my paper on Markov categories, but goes back to Disintegration and Bayesian Inversion via String Diagrams by Cho and Jacobs in slightly less general form.

Let me write $f(x|a)$ for any morphism $f : A \to X$ in a category which "looks like" the Kleisli category of a probability monad, i.e. so that we can think of $f$ as a Markov kernel. A bit as with abstract index notation, $a$ and $x$ are just placeholders which remind us that the domain and codomain of $f$ are $A$ and $X$, respectively. Similarly, $f(x,y|a,b)$ denotes a morphism $f : A \otimes B \to X \otimes Y$. We can now denote composition of $f : A \to B$ and $g : B \to C$ as a formal version of the Chapman-Kolmogorov equation,

$$(g \circ f)(c|a) := \sum_b g(c|b) \, f(b|a).$$

Similarly, parallel composition (monoidal product) of morphisms is simply given by juxtaposition.

Now I can almost state the definition of conditioning, which is pretty much the usual one, just reinterpreted in terms of this formal notation. But more on this after the session :)
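On finite sets, the formal Chapman-Kolmogorov equation above is literally multiplication of row-stochastic matrices. A small sketch (my own encoding, kernels as nested dicts with `kernel[a][b]` the probability of `b` given `a`):

```python
# On finite sets, the Chapman-Kolmogorov formula is multiplication of
# row-stochastic matrices. Kernels are nested dicts, kernel[a][b] = k(b|a)
# (my own encoding, for illustration).

def compose(g, f):
    """(g . f)(c|a) = sum over b of g(c|b) * f(b|a)."""
    codomain = next(iter(g.values())).keys()
    return {a: {c: sum(g[b][c] * f[a][b] for b in f[a]) for c in codomain}
            for a in f}

f = {"a": {"x": 0.25, "y": 0.75}}   # kernel A -> X
g = {"x": {"u": 1.0, "v": 0.0},
     "y": {"u": 0.5, "v": 0.5}}     # kernel X -> U
print(compose(g, f))  # {'a': {'u': 0.625, 'v': 0.375}}
```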

view this post on Zulip Tobias Fritz (Jun 05 2020 at 15:44):

So now if $f(x,y|a)$ is any morphism $A \to X \otimes Y$, then a conditional of $f$ on $X$ is any morphism $f_{|X}(y|x,a)$, meaning $f_{|X} : X \otimes A \to Y$, which satisfies the equation

$$f(x,y|a) = f(y|x,a) \, f(x|a).$$

Note that this is equivalent to the usual definition of conditional modulo just dividing by the $f(x|a)$ term, and the usual definition is also typically stated only for $A = I$ being the monoidal unit, so that the $a$ input does not appear.

Here, I'm actually using that our Kleisli category has a bit more structure than just symmetric monoidal: every object also comes equipped with a distinguished comonoid structure. In particular, by composing with the counit of $Y$, we can turn $f(x,y|a)$ into its marginal $f(x|a)$. The right-hand side also makes two references to $a$; this corresponds categorically to an application of the comultiplication $A \to A \otimes A$, which can be thought of intuitively as copying.
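For finite distributions with trivial $A$, the defining equation can be solved explicitly: the conditional is the joint divided by the marginal wherever the marginal is positive. A toy sketch in my own notation:

```python
# The defining equation f(x,y) = f(y|x) f(x), for a finite joint
# distribution with trivial A: divide the joint by the marginal wherever
# the marginal is positive (a toy sketch, my own notation).

def marginal_x(joint):
    """f(x) = sum over y of f(x, y)."""
    m = {}
    for (x, _y), p in joint.items():
        m[x] = m.get(x, 0.0) + p
    return m

def conditional(joint):
    """f(y|x) = f(x, y) / f(x), defined only where f(x) > 0."""
    m = marginal_x(joint)
    return {(x, y): p / m[x] for (x, y), p in joint.items() if m[x] > 0}

joint = {("x0", "y0"): 0.25, ("x0", "y1"): 0.25, ("x1", "y0"): 0.5}
print(conditional(joint))
# {('x0', 'y0'): 0.5, ('x0', 'y1'): 0.5, ('x1', 'y0'): 1.0}
```

Where the marginal vanishes the conditional is unconstrained, which matches the "any morphism which satisfies the equation" phrasing: conditionals are generally not unique.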

view this post on Zulip Tobias Fritz (Jun 05 2020 at 15:46):

In terms of string diagrams, that defining equation looks like this:
conditional.png

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 17:40):

Paolo Perrone - my next question was going to be whether any of the probability monads also have a comonad structure. But I had no specific example in mind, simply curious!

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 17:43):

When you take the comonad of the Eilenberg-Moore category of these probability monads, and then take the coEilenberg-Moore category of these comonads, do you get back the base category you started with? (aka are probability monads of (effective) descent type?)

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 17:48):

Tobias Fritz thanks for your answer! Hopefully it'll turn out that there is a notion of probability monad, so that we can do "synthetic probability theory" in strange new interesting examples!

view this post on Zulip Peter Arndt (Jun 05 2020 at 17:54):

Thanks a lot - this is intriguing!

view this post on Zulip Paolo Perrone (Jun 05 2020 at 18:02):

JS Pacaud Lemay said:

Paolo Perrone - my next question was going to be whether any of the probability monads also have a comonad structure. But I had no specific example in mind, simply curious!

If the question is whether the endofunctor of a probability monad also has a comonad structure, on the same base category, I'd say the answer is no, or at least, not in a way that can be interpreted as having a probabilistic meaning. The reason is that there doesn't seem to be a canonical counit map in general, i.e. a (deterministic) map $PX \to X$ where $X$ is a generic space (measurable, topological, just a set, ...).

view this post on Zulip Paolo Perrone (Jun 05 2020 at 18:03):

JS Pacaud Lemay said:

When you take the comonad of the Eilenberg-Moore category of these probability monads, and then take the coEilenberg-Moore category of these comonads, do you get back the base category you started with? (aka are probability monads of (effective) descent type?)

Good question! I've never looked into this. Has anyone here tried?

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 20:34):

Ah I see... not having a counit makes sense. Thanks for the answers!

view this post on Zulip JS PL (he/him) (Jun 05 2020 at 20:37):

I went back and looked at a paper I really like by Bart Jacobs: https://arxiv.org/pdf/1309.0844.pdf
It turns out one of his examples is the discrete probability monad. I like that paper a lot because it nicely shows that these coalgebras have a sort of "basis". Maybe most probability monads are of effective descent type (now why is that interesting? I don't know!)

view this post on Zulip Nathaniel Virgo (Jun 06 2020 at 09:19):

Paolo Perrone said:

However, on the other side of the Eilenberg-Moore and Kleisli adjunctions, one does get interesting comonads. For Eilenberg-Moore, one gets a comonad on "convex spaces" which maps a convex space to a "free presentation" of it, which in the case of probability monads gives all the probability measures that have a certain expectation value (one can view them as "instructions to get to that point", with the whole process given by the bar construction).

Is there somewhere one can read about this? It sounds like a piece in the "how to do information geometry with category theory" puzzle.

view this post on Zulip Paolo Perrone (Jun 06 2020 at 11:13):

Uhmm. One place could be my MFPS talk, https://youtu.be/28EASeG1RBA
A lot of work on these matters has been done (not by me) for the case of operads; as far as I know, we were the first to look at the case of probability monads.

view this post on Zulip Valeria de Paiva (Jun 07 2020 at 13:21):

Paolo Perrone said:

Hello all. Here's the first tutorial video, an elementary introduction to probability monads.
https://youtu.be/3Da88Tgw_rM
Feel free to ask any questions here :)

hi Paolo, have the slides for the tutorials been posted somewhere?

view this post on Zulip Paolo Perrone (Jun 07 2020 at 14:00):

Valeria de Paiva said:

Paolo Perrone said:

Hello all. Here's the first tutorial video, an elementary introduction to probability monads.
https://youtu.be/3Da88Tgw_rM
Feel free to ask any questions here :)

hi Paolo, have the slides for the tutorials been posted somewhere?

Right. Not yet! Let me give them to Tobias.