Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: categorical probability

Topic: conditionals


Richard Samuelson (Aug 19 2021 at 04:16):

I have been thinking about how to characterize conditionals categorically. Here is a formulation that (I think) differs from @Tobias Fritz's.

Let $\Theta$ be a category of the form

Fig1.jpg

in which objects are standard measurable spaces with the same sample space, and morphisms are identity maps from finer spaces to coarser ones.

Let $\mathbf{P}: \Delta 1 \rightarrow \Theta$ be a cone from the singleton set to $\Theta$.

Fig2.jpg

Then there exists another category $\Theta^*$ with the same objects as $\Theta$ and morphisms pointing the other way, along with a contravariant cone $\mathbf{P}^*: \Delta 1 \rightarrow \Theta^*$ such that the following diagram commutes:

Fig3.jpg

where $\delta: \Theta^* \Rightarrow \Theta$ is the natural transformation in which, for all $i \geq 0$, $\delta_{\Omega_i}: \Omega_i \rightarrow \Omega_i$ is the identity map.

Fig4.1.jpg

I think that this universal property entirely characterizes conditionals.

Tobias Fritz (Aug 19 2021 at 07:28):

Nice idea! I need to think about it a little more and need to go soon, but for now let me just comment that your terminology and notation are a bit non-standard, which may impede communication with other category theorists. Specifically:

This is not to criticize the idea, which I think goes in the right direction. (More on that later.) It's normal that your parlance will be a bit non-standard at the beginning, especially if you haven't talked to category theorists much before. I'm trying to point it out early so that it won't perpetuate.

Tobias Fritz (Aug 19 2021 at 13:54):

Now how about that definition of conditionals itself? Let me first try to formulate it in my own words, and then see whether this is indeed what you have in mind.

Let $\mathsf{Meas}$ be the category of measurable spaces and measurable maps, and let $\mathsf{Stoch}$ be the Kleisli category of the Giry monad on $\mathsf{Meas}$, so that its morphisms are Markov kernels between measurable spaces. Then we have a canonical functor $\mathsf{Meas} \to \mathsf{Stoch}$. It turns every measurable map into the corresponding (deterministic) Markov kernel. It's worth noting that this functor is not faithful.
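As a small illustration of the non-faithfulness remark, here is a hedged finite sketch. It assumes a kernel can be stored as a table of values $k(x, A)$ for points $x$ and measurable sets $A$; the names `powerset`, `deterministic_kernel`, `identity`, and `swap` are made up for this example and do not come from the thread. On a two-point codomain with the trivial $\sigma$-algebra, the identity map and the swap map induce the same deterministic kernel, while with the discrete $\sigma$-algebra they do not.

```python
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, i.e. its discrete sigma-algebra."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def deterministic_kernel(f, points, events):
    """Kernel induced by a map f: k(x, A) = 1 if f(x) lies in A, else 0."""
    return {(x, A): int(f(x) in A) for x in points for A in events}

points = ("a", "b")

def identity(x):
    return x

def swap(x):
    return "b" if x == "a" else "a"

# Trivial sigma-algebra on the codomain: only the empty set and the whole space.
trivial = [frozenset(), frozenset(points)]
# Discrete sigma-algebra on the codomain: every subset is measurable.
discrete = powerset(points)

# Two different maps, one and the same kernel when the codomain is trivial.
same_on_trivial = (deterministic_kernel(identity, points, trivial)
                   == deterministic_kernel(swap, points, trivial))
# With enough measurable sets to separate points, the kernels differ.
same_on_discrete = (deterministic_kernel(identity, points, discrete)
                    == deterministic_kernel(swap, points, discrete))
```

The design point is only that a kernel sees a map through the measurable sets of the codomain, so maps that no measurable set can tell apart are identified.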

Suppose that we have $f : \Omega_1 \to \Omega_0$ in $\mathsf{Meas}$. Denoting the singleton measurable space by $1$, a morphism $\mathbf{P}_1 : 1 \to \Omega_1$ in $\mathsf{Stoch}$ is exactly a probability measure, so that $(\Omega_1,\mathbf{P}_1)$ is a probability space. The pushforward of this measure is the composite $1 \stackrel{\mathbf{P}_1}{\to} \Omega_1 \stackrel{f}{\to} \Omega_0$ in $\mathsf{Stoch}$. If we denote it by $\mathbf{P}_0$, then we can consider both measures as the two components of a cone from $1$ to the diagram $\Omega_1 \to \Omega_0$ in $\mathsf{Stoch}$.

Now a conditional is a section $\alpha : \Omega_0 \to \Omega_1$ of $f$ in $\mathsf{Stoch}$, which means that $f \circ \alpha = \mathrm{id}$, and in addition such that $\alpha \circ \mathbf{P}_0 = \mathbf{P}_1$. In other words, $\alpha$ must be a section such that $\mathbf{P}$ also forms a cone with respect to the diagram $\Omega_0 \stackrel{\alpha}{\to} \Omega_1$.
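The two conditions can be checked by hand in a finite case. The following is a minimal sketch, assuming finite sets with discrete $\sigma$-algebras, kernels stored as nested dictionaries, and a measure on $\Omega_1$ whose pushforward is strictly positive. It writes `P1` for the measure on the finer space and `P0` for its pushforward. The helper `compose`, the sample spaces, and the numbers are all hypothetical.

```python
from fractions import Fraction as F

def compose(g, f):
    """Kleisli composite 'g after f' of finite kernels {source: {target: prob}}."""
    out = {}
    for x, row in f.items():
        acc = {}
        for y, p in row.items():
            for z, q in g[y].items():
                if p * q != 0:  # keep only nonzero entries so dicts compare cleanly
                    acc[z] = acc.get(z, 0) + p * q
        out[x] = acc
    return out

omega1 = [0, 1, 2, 3]        # finer space
omega0 = ["lo", "hi"]        # coarser space

def f(x):
    """Measurable map Omega_1 -> Omega_0 merging {0, 1} and {2, 3}."""
    return "lo" if x < 2 else "hi"

# The map f as a deterministic kernel.
f_kernel = {x: {f(x): F(1)} for x in omega1}

# A probability measure on Omega_1 is a kernel out of the one-point space "*".
P1 = {"*": {0: F(1, 8), 1: F(3, 8), 2: F(1, 4), 3: F(1, 4)}}
# Its pushforward along f is the composite f after P1.
P0 = compose(f_kernel, P1)

# Candidate conditional: renormalise P1 on each fibre of f (assumes P0 > 0).
alpha = {y: {x: P1["*"][x] / P0["*"][y] for x in omega1 if f(x) == y}
         for y in omega0}

identity0 = {y: {y: F(1)} for y in omega0}
is_section = compose(f_kernel, alpha) == identity0   # f after alpha = id
is_cone = compose(alpha, P0) == P1                   # alpha after P0 = P1
```

Both `is_section` and `is_cone` evaluate to `True` for this choice of `alpha`; on fibres of measure zero the renormalisation is undefined, which is where non-uniqueness of conditionals enters.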

Finally, it's worth noting that in some situations, such as in the theory of stochastic processes, we have more than just two measurable spaces, but rather a whole diagram, like $\ldots \to \Omega_2 \to \Omega_1 \to \Omega_0$. In this case, the definition of conditional can be generalized straightforwardly by using the formulation in terms of sections and cones.

Is this an accurate representation of what you have in mind?

Richard Samuelson (Aug 20 2021 at 20:56):

Thank you for the considered response. That is indeed what I had in mind.

$\Theta$, $\Theta^*$, and $\Delta 1$ are actually functors from $\mathbf{N} \cup \{\infty\}$ into $\mathsf{Stoch}$; and $\mathbf{P}$ is a natural transformation.

It seems to me that there should be such a thing as a "contravariant" natural transformation, from a functor to a contravariant functor. Then $\mathbf{P}^*$ and $\delta$ would be just such a thing.


Here is something interesting:

If $\Omega_0$ and $\Omega_1$ are objects in $\mathsf{Stoch}$ and $i: \Omega_1 \rightarrow \Omega_0$ is the pushforward of the identity map, then $\Omega_1$ must have the same sample space as $\Omega_0$ and a finer $\sigma$-algebra.

As you mentioned above, any measure $\mathbf{P}: 1 \rightarrow \Omega_1$ can be turned into a cone:

Fig2.1.jpg

A conditional expectation $K: \Omega_0 \rightarrow \Omega_1$ then makes the following diagram commute:

Fig2.2.jpg

Where $\Delta$ is the pushforward of the copy map $x \mapsto (x, x)$, and $\delta: \Omega_0 \rightarrow \Omega_0$ is the identity morphism.

The equality

$(K \otimes \delta) \circ \Delta = \Delta \circ K$

is the "conditional determinism" property of conditional expectations, and the equality

$(K \otimes \delta) \circ \Delta \circ \mathbf{P}_0 = \Delta \circ \mathbf{P}_1$

is the projection property.

Tobias Fritz (Aug 21 2021 at 15:22):

You're very welcome! It would be nice if we had some others chime in as well.

Concerning natural transformations from a covariant functor to a contravariant one, and things like that: the important thing is to explain what you mean by any terminology that isn't standard (not textbook material). So by contravariant natural transformation, I assume that you mean this, but it's still a bit of a guess.

I can't quite parse your second diagram. What is the map $\Delta : \Omega_1 \to \Omega_1 \otimes \Omega_0$?

What you have in mind may be this diagram? While this is not a commutative diagram because the pentagon does not commute, it is a partially commutative diagram in the sense that it does commute when you start at $\Delta 1$. And this is precisely the definition of conditioning (in the form of Bayesian inversion) given by Cho and Jacobs, as in their (5) on p.10. (They write $\sigma$ instead of $\mathbf{P}$ and $c$ instead of $i$ and $d$ instead of $K$.)

(One more silly terminology thing: I'm pretty sure that you don't intend $K$ to be the conditional expectation, since that would be a map from functions on one space to functions on the other. What you seem to be referring to is called a regular conditional probability.)
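For readers who want to see Bayesian inversion concretely, here is a hedged finite sketch. It reads the condition as the usual equality of two joint distributions, $\sigma(x)\,c(y \mid x) = (c \circ \sigma)(y)\,d(x \mid y)$, which is an assumption about the cited equation since it is not reproduced in this thread. The names `sigma`, `c`, and `d` follow the notation attributed to Cho and Jacobs above; the sample points and numbers are made up.

```python
from fractions import Fraction as F

# Prior sigma on X and channel c : X -> Y (illustrative values only).
sigma = {"x0": F(1, 3), "x1": F(2, 3)}
c = {
    "x0": {"y0": F(1, 2), "y1": F(1, 2)},
    "x1": {"y0": F(1, 4), "y1": F(3, 4)},
}

# Marginal on Y: the composite 'c after sigma'.
marginal = {}
for x, px in sigma.items():
    for y, pyx in c[x].items():
        marginal[y] = marginal.get(y, 0) + px * pyx

# Bayesian inverse d : Y -> X, defined where the marginal is nonzero.
d = {y: {x: sigma[x] * c[x][y] / marginal[y] for x in sigma} for y in marginal}

# The two ways of building the joint distribution on X x Y.
joint_forward = {(x, y): sigma[x] * c[x][y] for x in sigma for y in c[x]}
joint_backward = {(x, y): marginal[y] * d[y][x] for y in marginal for x in d[y]}

inversion_holds = joint_forward == joint_backward
```

The equality only constrains `d` on points of nonzero marginal probability, which matches the remark that the diagram commutes only when starting from the distribution rather than as a whole.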