Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: theory: categorical probability

Topic: conditionals

Richard Samuelson (Aug 19 2021 at 04:16):

I have been thinking about how to characterize conditionals categorically. Here is a formulation that is (I think) different from @Tobias Fritz's.

Let $\Theta$ be a category of the form

Fig1.jpg

in which objects are standard measurable spaces with the same sample space, and morphisms are identity maps from finer spaces to coarser ones.

Let $\mathbf{P}: \Delta 1 \rightarrow \Theta$ be a cone from the singleton set to $\Theta$.

Fig2.jpg

Then there exists another category $\Theta^*$ with the same objects as $\Theta$ and morphisms pointing the other way, along with a contravariant cone $\mathbf{P}^*: \Delta 1 \rightarrow \Theta^*$ such that the following diagram commutes:

Fig3.jpg

where $\delta: \Theta^* \Rightarrow \Theta$ is the natural transformation in which, for all $i \geq 0$, $\delta_{\Omega_i}: \Omega_i \rightarrow \Omega_i$ is the identity map.

Fig4.1.jpg

I think that this universal property entirely characterizes conditionals.

Tobias Fritz (Aug 19 2021 at 07:28):

Nice idea! I need to think about it a little more and need to go soon, but for now let me just comment that your terminology and notation are a bit non-standard, which may impede communication with other category theorists. Specifically:

There is no such thing as a contravariant cone. Isn't your $\mathbf{P}^*$ just a cone in the usual sense? (It's clearly not a cocone!)
You first say that $\Theta$ is a category, but then you use notation implying that $\Theta$ is a functor, since it's the codomain of a natural transformation.
Your $\delta$ can't actually be a natural transformation, since it seems to go from a contravariant functor to a covariant one. For a natural transformation, the domain and codomain must be functors of the same type.
Most importantly, you give an equational condition for a bunch of morphisms and call it a "universal property". This is not what we mean by a universal property. A universal property is what you have when you characterize an object by saying how to map into it [or out of it] from any other object. For example, "Giving a map into a product $A \times B$ amounts to giving a map into $A$ and a map into $B$".

This is not to criticize the idea, which I think goes in the right direction. (More on that later.) It's normal that your parlance will be a bit non-standard at the beginning, especially if you haven't talked to category theorists much before. I'm trying to point it out early so that it won't perpetuate.

Tobias Fritz (Aug 19 2021 at 13:54):

Now how about that definition of conditionals itself? Let me first try to formulate it in my own words, and then see whether this is indeed what you have in mind.

Let $\mathsf{Meas}$ be the category of measurable spaces and measurable maps, and let $\mathsf{Stoch}$ be the Kleisli category of the Giry monad on $\mathsf{Meas}$, so that its morphisms are Markov kernels between measurable spaces. Then we have a canonical functor $\mathsf{Meas} \to \mathsf{Stoch}$. It turns every measurable map into the corresponding (deterministic) Markov kernel. It's worth noting that this functor is not faithful.
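A minimal finite sketch of why the functor can fail to be faithful, under the assumption (not spelled out in the message) that the failure comes from $\sigma$-algebras that do not separate points. The two-point space, the names `deterministic_kernel`, `trivial`, and `discrete`, and the tabulation of a kernel as a dictionary on measurable sets are all illustrative choices.

```python
from itertools import product

def deterministic_kernel(func, points, sigma_algebra):
    """Tabulate the kernel x -> Dirac measure at func(x) on the measurable sets."""
    return {
        (x, A): 1 if func(x) in A else 0
        for x, A in product(points, sigma_algebra)
    }

points = (0, 1)
# Two sigma-algebras on the codomain {0, 1} (hypothetical example data).
trivial = (frozenset(), frozenset({0, 1}))             # does not separate 0 and 1
discrete = trivial + (frozenset({0}), frozenset({1}))  # separates 0 and 1

def identity(x):
    return x

def swap(x):
    return 1 - x

# Two different maps into the trivially structured space give the same kernel...
same_on_trivial = (
    deterministic_kernel(identity, points, trivial)
    == deterministic_kernel(swap, points, trivial)
)
# ...while on the discrete space their kernels differ.
same_on_discrete = (
    deterministic_kernel(identity, points, discrete)
    == deterministic_kernel(swap, points, discrete)
)
print(same_on_trivial, same_on_discrete)
```

Any map into a space carrying the trivial $\sigma$-algebra is measurable, so both maps are legitimate morphisms of $\mathsf{Meas}$ that become equal in $\mathsf{Stoch}$.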

Suppose that we have $f : \Omega_1 \to \Omega_0$ in $\mathsf{Meas}$. Denoting the singleton measurable space by $1$, a morphism $\mathbf{P}_1 : 1 \to \Omega_1$ in $\mathsf{Stoch}$ is exactly a probability measure on $\Omega_1$, so that $(\Omega_1,\mathbf{P}_1)$ is a probability space. The pushforward of this measure along $f$ is the composite $1 \stackrel{\mathbf{P}_1}{\to} \Omega_1 \stackrel{f}{\to} \Omega_0$ in $\mathsf{Stoch}$. If we denote it by $\mathbf{P}_0$, then we can consider both measures as the two components of a cone from $1$ to the diagram $\Omega_1 \to \Omega_0$ in $\mathsf{Stoch}$.

Now a conditional is a section $\alpha : \Omega_0 \to \Omega_1$ of $f$ in $\mathsf{Stoch}$, which means that $f \circ \alpha = \mathrm{id}$, and which in addition satisfies $\alpha \circ \mathbf{P}_0 = \mathbf{P}_1$. In other words, $\alpha$ must be a section such that $\mathbf{P}$ also forms a cone with respect to the diagram $\Omega_0 \stackrel{\alpha}{\to} \Omega_1$.
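For finite spaces the two conditions can be checked by hand. The following is a sketch only: the four-point $\Omega_1$, the two-point $\Omega_0$, the measure `P1`, and the helper names `compose` and `deterministic` are made up for illustration, and kernels are stored as nested dictionaries `k[x][y]`.

```python
from fractions import Fraction as F

def compose(g, f):
    """Kleisli composite g . f of finite kernels stored as nested dicts k[x][y]."""
    out = {}
    for x, row in f.items():
        acc = {}
        for y, p in row.items():
            for z, q in g[y].items():
                acc[z] = acc.get(z, F(0)) + p * q
        out[x] = {z: v for z, v in acc.items() if v != 0}
    return out

def deterministic(func, domain):
    """Deterministic kernel of a function on a finite set."""
    return {x: {func(x): F(1)} for x in domain}

omega1 = [0, 1, 2, 3]   # finer space (hypothetical example)
omega0 = ["a", "b"]     # coarser space
f = deterministic(lambda x: "a" if x < 2 else "b", omega1)

# A measure on omega1, written as a kernel out of the one-point space {"*"}.
P1 = {"*": {0: F(1, 8), 1: F(3, 8), 2: F(1, 4), 3: F(1, 4)}}
P0 = compose(f, P1)     # its pushforward to omega0

# Candidate conditional: renormalize P1 on each fibre of f.
alpha = {
    "a": {0: F(1, 4), 1: F(3, 4)},
    "b": {2: F(1, 2), 3: F(1, 2)},
}

section_ok = compose(f, alpha) == deterministic(lambda y: y, omega0)  # f . alpha = id
cone_ok = compose(alpha, P0) == P1                                    # alpha . P0 = P1
print(section_ok, cone_ok)
```

Here `alpha` is supported on the fibres of `f`, which is what makes it a section, and its weights are the fibrewise normalizations of `P1`, which is what makes the cone condition hold.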

Finally, it's worth noting that in some situations, such as in the theory of stochastic processes, we have more than just two measurable spaces, but rather a whole diagram, like $\ldots \to \Omega_2 \to \Omega_1 \to \Omega_0$. In this case, the definition of conditional can be generalized straightforwardly by using the formulation in terms of sections and cones.

Is this an accurate representation of what you have in mind?

Richard Samuelson (Aug 20 2021 at 20:56):

Thank you for the considered response. That is indeed what I had in mind.

$\Theta$, $\Theta^*$, and $\Delta 1$ are actually functors from $\mathbf{N} \cup \{\infty\}$ into $\mathsf{Stoch}$; and $\mathbf{P}$ is a natural transformation.

It seems to me that there should be such a thing as a "contravariant" natural transformation, from a functor to a contravariant functor. Then $\mathbf{P}^*$ and $\delta$ would be just such a thing.

Here is something interesting:

If $\Omega_0$ and $\Omega_1$ are objects in $\mathsf{Stoch}$ and $i: \Omega_1 \rightarrow \Omega_0$ is the pushforward of the identity map, then $\Omega_1$ must have the same sample space as $\Omega_0$ and a finer $\sigma$-algebra.

As you mentioned above, any measure $\mathbf{P}: 1 \rightarrow \Omega_1$ can be turned into a cone:

Fig2.1.jpg

A conditional expectation $K: \Omega_0 \rightarrow \Omega_1$ then makes the following diagram commute:

Fig2.2.jpg

where $\Delta$ is the pushforward of the copy map $x \mapsto (x, x)$, and $\delta: \Omega_0 \rightarrow \Omega_0$ is the identity morphism.

The equality

$(K \otimes \delta) \circ \Delta = \Delta \circ K$

is the "conditional determinism" property of conditional expectations, and the equality

$(K \otimes \delta) \circ \Delta \circ \mathbf{P}_0 = \Delta \circ \mathbf{P}_1$

is the projection property.

Tobias Fritz (Aug 21 2021 at 15:22):

You're very welcome! It would be nice if we had some others chime in as well.

Concerning natural transformations from a covariant functor to a contravariant one, and things like that, the important thing is to explain what you mean by any terminology that isn't standard (not textbook material). So by contravariant natural transformation, I assume that you mean this, but it's still a bit of a guess.

I can't quite parse your second diagram. What is the map $\Delta : \Omega_1 \to \Omega_1 \otimes \Omega_0$?

What you have in mind may be this diagram? While this is not a commutative diagram, because the pentagon does not commute, it is a partially commutative diagram in the sense that it does commute when you start at $\Delta 1$. And this is precisely the definition of conditioning (in the form of Bayesian inversion) given by Cho and Jacobs, as in their (5) on p.10. (They write $\sigma$ instead of $\mathbf{P}$ and $c$ instead of $i$ and $d$ instead of $K$.)
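A finite sketch of that partial commutativity, in the Cho and Jacobs notation. It assumes the equation in question is that the joint distribution built from $\sigma$ and $c$ equals the one built from the pushforward $c \circ \sigma$ and $d$, and that the non-commuting pentagon is the same equation with the state $\sigma$ left off. The numbers and variable names are invented for illustration.

```python
from fractions import Fraction as F

# Prior sigma on X and channel c: X -> Y (hypothetical example data).
sigma = {"x0": F(1, 4), "x1": F(3, 4)}
c = {
    "x0": {"y0": F(1, 2), "y1": F(1, 2)},
    "x1": {"y0": F(1, 6), "y1": F(5, 6)},
}

# Pushforward c . sigma, a distribution on Y.
marginal = {}
for x, px in sigma.items():
    for y, q in c[x].items():
        marginal[y] = marginal.get(y, F(0)) + px * q

# Bayesian inverse d: Y -> X, defined by Bayes' rule (marginal has full support here).
d = {y: {x: sigma[x] * c[x][y] / marginal[y] for x in sigma} for y in marginal}

# Starting from the state: both ways of building the joint on X x Y agree.
joint_via_c = {(x, y): sigma[x] * c[x][y] for x in sigma for y in marginal}
joint_via_d = {(x, y): marginal[y] * d[y][x] for x in sigma for y in marginal}
commutes_from_state = joint_via_c == joint_via_d

# Without the state: compare the two kernels X -> X x Y at each starting point x.
def copy_then_c(x):
    return {(x2, y): (c[x][y] if x2 == x else F(0)) for x2 in sigma for y in marginal}

def c_then_copy_then_d(x):
    return {(x2, y): c[x][y] * d[y][x2] for x2 in sigma for y in marginal}

pentagon_commutes = all(copy_then_c(x) == c_then_copy_then_d(x) for x in sigma)
print(commutes_from_state, pentagon_commutes)
```

In this example the two joints coincide, while the kernels themselves differ: the second one can move mass to a different point of $X$, which the first never does.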

(One more silly terminology thing: I'm pretty sure that you don't intend $K$ to be the conditional expectation, since that would be a map from functions on one space to functions on the other. What you seem to be referring to is called a regular conditional probability.)