Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.

Stream: theory: categorical probability

Topic: Unpacking Stoch

Nathaniel Virgo (Feb 25 2023 at 11:23):

I'm trying to unpack the definition of Stoch (the Kleisli category of the Giry monad) as a Markov category. My goal is to be able to take any string diagram and express it explicitly in terms of measure-theoretic integrals, in order to prove properties of Stoch. I don't know much measure theory and am trying to learn what I need as I go along.

So far the best resource I've found for unpacking Stoch is Panangaden's 'The Category of Markov Kernels' (1999), which is really helpful but stops short of completely answering my question. The tricky part is in the details of how to reason about the interaction between the monoidal product and composition, as I explain below.

Here is what I know so far:

Objects are measurable spaces, which I'll write as $(X,\Sigma_X)$ . I'll write $\sigma_X, \tau_X$ etc. for elements of $\Sigma_X$ .

Then a morphism $f\colon X\to Y$ is a map $f\colon \Sigma_Y\times X\to [0,1]$ , where we write $f(\sigma_Y\mid x)$ for $f(\sigma_Y,x)$ . This must have the properties that

$f({-}\mid x)\colon \Sigma_Y\to [0,1]$ is a probability measure for every $x\in X$ , and
$f(\sigma_Y\mid{-})\colon X\to [0,1]$ is a measurable function for every $\sigma_Y\in\Sigma_Y$ .

Composition $X\xrightarrow{f}Y\xrightarrow{g}Z$ is given by

$(f;g)(\sigma_Z\mid x) = \int_{y\in Y} g(\sigma_Z\mid y) \,d\, f(y\mid x).$

There is some work needed to see that this function has the needed properties and that composition is associative. Panangaden gives an explicit proof of associativity. Identities are Dirac deltas.
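On finite sets with the discrete $\sigma$ -algebra the integral in the composition formula becomes a finite sum, so the definitions above can be checked by hand. The following is only a minimal sketch under that finiteness assumption: the dict-of-dicts representation and the names `compose` and `identity` are illustrative choices, not something taken from Panangaden's paper.

```python
from fractions import Fraction as F

# Assumption: a kernel X -> Y on finite sets is stored as kernel[x][y] = f({y} | x).

def compose(f, g):
    """(f;g)(z | x) = sum over y of g(z | y) * f(y | x)."""
    out = {}
    for x, row in f.items():
        out[x] = {}
        for y, p in row.items():
            for z, w in g[y].items():
                out[x][z] = out[x].get(z, F(0)) + p * w
    return out

def identity(points):
    """Dirac delta kernel: id(- | x) puts all of its mass on x."""
    return {x: {x: F(1)} for x in points}

f = {"x0": {"y0": F(1, 2), "y1": F(1, 2)}, "x1": {"y0": F(1, 4), "y1": F(3, 4)}}
g = {"y0": {"z0": F(1, 3), "z1": F(2, 3)}, "y1": {"z0": F(1), "z1": F(0)}}

fg = compose(f, g)
for x, row in fg.items():
    # Each row of the composite is again a probability distribution.
    print(x, dict(row), "total mass:", sum(row.values()))
```

In this representation composition is just multiplication of row-stochastic matrices, and composing with `identity` on either side returns the original kernel.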

The monoidal product is given on objects by $(X\times Y, \Sigma_X\otimes \Sigma_Y)$ , where $\Sigma_X\otimes \Sigma_Y$ is the product $\sigma$ -algebra. For morphisms $f\colon A\to X$ and $g\colon B\to Y$ we define

$(f\otimes g)(\sigma_{X}\times \sigma_{Y}\mid a,b) = f(\sigma_{X}\mid a)g(\sigma_{Y}\mid b).$

This defines the measure only on elements of $\Sigma_X\otimes \Sigma_Y$ that are of the form $\sigma_{X}\times \sigma_{Y}$ , but this uniquely extends to a measure on $\Sigma_X\otimes \Sigma_Y$ by Carathéodory's extension theorem.
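In the finite discrete case no extension theorem is needed, because every subset of $X\times Y$ is a finite disjoint union of one-point rectangles and additivity fixes its measure. A small sketch under that assumption (the helper names `tensor` and `measure` are hypothetical, chosen only for this illustration):

```python
from fractions import Fraction as F
from itertools import product

# Assumption: kernels on finite sets, stored as kernel[source][point] = probability.

def tensor(f, g):
    """(f (x) g)({(x, y)} | a, b) = f({x} | a) * g({y} | b)."""
    return {
        (a, b): {(x, y): f[a][x] * g[b][y] for x, y in product(f[a], g[b])}
        for a, b in product(f, g)
    }

def measure(kernel, source, event):
    """kernel(event | source): add up the point masses lying in the set `event`."""
    return sum((p for pt, p in kernel[source].items() if pt in event), F(0))

f = {"a0": {"x0": F(1, 2), "x1": F(1, 2)}}
g = {"b0": {"y0": F(1, 3), "y1": F(1, 3), "y2": F(1, 3)}}
fg = tensor(f, g)

# On a rectangle sigma_X x sigma_Y the value is the product of the two factors.
sx, sy = {"x0"}, {"y0", "y2"}
rect = set(product(sx, sy))
lhs = measure(fg, ("a0", "b0"), rect)
rhs = measure(f, "a0", sx) * measure(g, "b0", sy)
print(lhs, rhs)

# A set that is not a rectangle still gets a value, fixed by additivity.
diagonal = {("x0", "y0"), ("x1", "y1")}
print(measure(fg, ("a0", "b0"), diagonal))
```

The general case differs in that the extension from rectangles to the whole product $\sigma$ -algebra is not a finite sum, which is where Carathéodory's theorem comes in.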

So far so good. The tricky part is in figuring out how composition and this monoidal product relate to each other.

For example, suppose I have morphisms $f\colon A\to X$ , $g\colon B\to Y$ and $h\colon X\otimes Y\to Z$ , and I want to calculate $(f\otimes g) \mathbin{⨾} h$ . The morphism $f\otimes g$ is as defined above, but in order to compose it with $h$ I need to be able to integrate against it - I can't see a way to write $(f\otimes g) \mathbin{⨾} h$ down as a single equation because of the step where we extend the partially defined measure to a full one.

Ultimately I want to be able to prove that certain morphisms are equal under certain conditions, and this sort of thing is causing a roadblock to that. So I guess my questions are

1) can the morphism $(f\otimes g) \mathbin{⨾} h$ be written down explicitly in a nice way?

2) is there somewhere where this explicit unpacking of Stoch is done and written down already?

If these questions get answers I might have more questions as well, but let's see how this goes.

John Baez (Feb 25 2023 at 17:24):

It can't hurt to email Prakash: he's a friendly guy and he'd be happy to know someone is interested in this.

Richard Samuelson (Feb 25 2023 at 17:32):

You could write it down as an integral. For all measurable

$p: Z \to [0, \infty]$

we have

$\int ((f \otimes g); h)(a, b, \text{d}z) p(z) = \int f(a, \text{d}x) \int g(b, \text{d}y) \int h(x, y, \text{d}z) \, p(z).$

Hence,

$((f \otimes g); h)(a, b, \text{d}z) = f(a, \text{d}x) g(b, \text{d}y) h(x, y, \text{d}z)$

using the notation in https://link.springer.com/book/10.1007/978-0-387-87859-1

Tobias Fritz (Feb 25 2023 at 17:59):

Sorry if this is a bit brief, but I think that the explicit formula should simply be

$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, df(x|a) \, dg(y|b).$

This is basically the same equation as calculating the expectation value of a measurable function with respect to a product measure. BTW I personally prefer writing this as

$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, f(dx|a) \, g(dy|b),$

with the $d$ 's at the integration variables, which seems more intuitive.
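For finite sets the formula can be sanity-checked by comparing the iterated sum against the composite built from the definitions of $\otimes$ and composition. This is a sketch under the assumption of finite discrete spaces; the kernels `f`, `g`, `h` and the helper `iterated` are made-up examples, not part of the discussion above.

```python
from fractions import Fraction as F
from itertools import product

# Assumption: kernels on finite sets, stored as kernel[source][point] = probability.

def compose(f, g):
    out = {}
    for a, row in f.items():
        out[a] = {}
        for y, p in row.items():
            for z, w in g[y].items():
                out[a][z] = out[a].get(z, F(0)) + p * w
    return out

def tensor(f, g):
    return {
        (a, b): {(x, y): f[a][x] * g[b][y] for x, y in product(f[a], g[b])}
        for a, b in product(f, g)
    }

def iterated(f, g, h, S, a, b):
    """sum_x sum_y h(S | x, y) f({x} | a) g({y} | b)."""
    total = F(0)
    for x, px in f[a].items():
        for y, py in g[b].items():
            h_S = sum((h[(x, y)][z] for z in S), F(0))
            total += h_S * px * py
    return total

f = {"a0": {0: F(1, 2), 1: F(1, 2)}, "a1": {0: F(1, 5), 1: F(4, 5)}}
g = {"b0": {0: F(1, 3), 1: F(2, 3)}}
h = {(x, y): {"hi": F(x + y, 2), "lo": F(2 - x - y, 2)}
     for x in (0, 1) for y in (0, 1)}

composite = compose(tensor(f, g), h)
print(composite[("a0", "b0")]["hi"], iterated(f, g, h, {"hi"}, "a0", "b0"))
```

Both printed values agree, as they do for every choice of `a`, `b` and `S` in this example.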

Nathaniel Virgo (Feb 26 2023 at 01:03):

Tobias Fritz said:

Sorry if this is a bit brief, but I think that the explicit formula should simply be

$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, df(x|a) \, dg(y|b).$

This is basically the same equation as calculating the expectation value of a measurable function with respect to a product measure.

Thank you, I'm sure this is right. (And brevity is good.) I was confused because I was actually trying to evaluate $q;(f\otimes g);h$ for some $q\colon 1\to X\times Y$ , and I was stuck at trying to integrate $h(S\mid x,y)$ against $q;(f\otimes g)$ . The latter is a measure defined on a product space but it isn't a product measure, so I couldn't do this.

But your comment made me realise I can just evaluate it as $q;((f\otimes g);h)$ instead and then there's no problem. Maybe this will always work - I can always just evaluate things from back to front and then I'll always either be integrating against a product measure or against a measure that's fully defined instead of partially defined and then extended. I'll try it on my real problem and see if it works.

BTW I personally prefer writing this as

$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, f(dx|a) \, g(dy|b),$

with the $d$ 's at the integration variables, which seems more intuitive.

I write it the way I do because I think of $\int_{x\in X}f(x)\,d\,p(x)$ as a souped-up $\langle p, f\rangle$ (applying a 'dual vector' to a 'vector'), where the $x$ is a dummy variable and the $d$ is really just the comma. But I can see the point of the other notation and will give it a try - it might be less confusing in the end.

Nathaniel Virgo (Feb 26 2023 at 01:56):

Richard Samuelson said:

You could write it down as an integral. For all measurable

$p: Z \to [0, \infty]$

we have

$\int ((f \otimes g); h)(a, b, \text{d}z) p(z) = \int h(x, y, \text{d}z) \int f(a, \text{d}x) \int g(b, \text{d}y) \, p(z).$

Hence,

$((f \otimes g); h)(a, b, \text{d}z) = h(x, y, \text{d}z) f(a, \text{d}x) g(b, \text{d}y)$

using the notation in https://link.springer.com/book/10.1007/978-0-387-87859-1

This makes me start to like the $dx$ notation actually - I like being able to write it implicitly like this. But I think we can only get rid of one of the integral signs, because we still have to integrate over $x$ and $y$ . So I'd write this as

$((f \otimes g); h)(dz\mid a, b) = \int_{y\in Y}\int_{x\in X}h(dz\mid x, y) f(dx\mid a) g(dy\mid b),$

which is basically what Tobias Fritz wrote only with $dz$ in place of $S\in\Sigma_Z$ .

Nathaniel Virgo (Feb 26 2023 at 03:10):

I have one more question, which is as much a notation question as anything else. Suppose I have $f\colon 1\to X\otimes Y$ and $g\colon X\otimes Y\to Z$ . What's the right way to write down their composite?

It feels like I should be able to write it as

$(f;g)(dz\mid \bullet) = \int_{x,y\in X\otimes Y} g(dz\mid x, y) f(dx, dy\mid\bullet),$

but I'm a bit uneasy about writing $dx$ and $dy$ , because the space being integrated over is $X\otimes Y$ , not $X$ and $Y$ separately. Is there a formal argument that allows it to be written this way, or should it really be something like

$(f;g)(dz\mid \bullet) = \int_{x,y\in X\otimes Y} g(dz\mid x, y) f(d(x,y)\mid\bullet)$

instead?

Tobias Fritz (Feb 26 2023 at 07:55):

Without a formal syntax in place, I'd say that both of those notations make sense since they express what you mean clearly enough. You could also write it as a double integral (by Fubini's theorem), and/or write $dx \times dy$ in the argument of $f$ in order to express the idea that one sums up the measures of tiny rectangles.

Nathaniel Virgo (Feb 26 2023 at 08:32):

I guess the semantics of these things is what I'm trying to grasp. Or at least I want to be sure I know what I mean when I write these things down.

Can I really use Fubini's theorem here? I thought I could only use that in the case where I'm integrating over a product measure. If I write

$\int_{x\in X} \int_{y\in Y} g(S\mid x,y) f(dx, dy \mid \bullet)$

then it looks to me like I need to evaluate something like

$\int_{y\in Y} g(S\mid x,y) f(T, dy \mid \bullet)$

and then integrate that over $X$ .

If that's right then what's the name of this kind of partial integral? I'm not having much luck googling things like "integrating over one variable of a joint measure".

The same thing comes up in this situation: suppose I have $q\colon 1\to A\otimes X$ and $f\colon X\to Y$ , and I want to calculate $q;(\mathrm{id}_A\otimes f)$ . I end up with

$(q;(\mathrm{id}_A\otimes f))(da,dy\mid \bullet) = \int_{(a', x)\in A\otimes X} f(dy\mid x) \,\mathrm{id}_A(da\mid a') \,\, q(da',dx\mid \bullet).$

It seems like I should be able to integrate out $a'$ to get

$(q;(\mathrm{id}_A\otimes f))(da,dy\mid \bullet) = \int_{x\in X} f(dy\mid x) \,\, q(da,dx\mid \bullet),$

but then I'm back to integrating only over $X$ even though $q$ is a measure over $A\otimes X$ , and I don't know what it means formally.
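One way to read the last expression, at least on rectangles: for a fixed measurable $T\subseteq A$ , the map $S\mapsto q(T\times S\mid\bullet)$ is a finite measure on $X$ , so $\int_{x\in X} f(S\mid x)\,q(T\times dx\mid\bullet)$ is an ordinary integral over $X$ . The sketch below checks this reading on finite sets only; the joint distribution `q`, the kernel `f` and the helper `partial_integral` are hypothetical examples, and `q` is deliberately chosen not to be a product measure.

```python
from fractions import Fraction as F
from itertools import product

# Assumption: kernels on finite sets, stored as kernel[source][point] = probability.

def compose(f, g):
    out = {}
    for a, row in f.items():
        out[a] = {}
        for y, p in row.items():
            for z, w in g[y].items():
                out[a][z] = out[a].get(z, F(0)) + p * w
    return out

def tensor(f, g):
    return {
        (a, b): {(x, y): f[a][x] * g[b][y] for x, y in product(f[a], g[b])}
        for a, b in product(f, g)
    }

def identity(points):
    return {x: {x: F(1)} for x in points}

# q: 1 -> A (x) X, written as a kernel out of the one-point space {"*"}.
q = {"*": {("a0", "x0"): F(1, 2), ("a0", "x1"): F(0),
           ("a1", "x0"): F(1, 6), ("a1", "x1"): F(1, 3)}}
f = {"x0": {"y0": F(1, 4), "y1": F(3, 4)}, "x1": {"y0": F(1), "y1": F(0)}}

def partial_integral(q, f, T, S):
    """Integrate f(S | -) against the measure q(T x -) on X."""
    total = F(0)
    for (a, x), mass in q["*"].items():
        if a in T:
            total += sum((f[x][y] for y in S), F(0)) * mass
    return total

composite = compose(q, tensor(identity(["a0", "a1"]), f))
print(composite["*"][("a1", "y0")], partial_integral(q, f, {"a1"}, {"y0"}))
```

In this example the two values agree on every rectangle `T x S`, which is enough to pin down the measure on the product.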

Tobias Fritz (Feb 26 2023 at 09:11):

Urgh, sorry, I'm a bit out of it today -- please ignore my reference to Fubini, you're right that it's not that.

Richard Samuelson (Feb 26 2023 at 19:09):

Nathaniel Virgo said:

Richard Samuelson said:

You could write it down as an integral. For all measurable

$p: Z \to [0, \infty]$

we have

$\int ((f \otimes g); h)(a, b, \text{d}z) p(z) = \int h(x, y, \text{d}z) \int f(a, \text{d}x) \int g(b, \text{d}y) \, p(z).$

Hence,

$((f \otimes g); h)(a, b, \text{d}z) = h(x, y, \text{d}z) f(a, \text{d}x) g(b, \text{d}y)$

using the notation in https://link.springer.com/book/10.1007/978-0-387-87859-1

This makes me start to like the $dx$ notation actually - I like being able to write it implicitly like this. But I think we can only get rid of one of the integral signs, because we still have to integrate over $x$ and $y$ . So I'd write this as

$((f \otimes g); h)(dz\mid a, b) = \int_{y\in Y}\int_{x\in X}h(dz\mid x, y) f(dx\mid a) g(dy\mid b),$

which is basically what Tobias Fritz wrote only with $dz$ in place of $S\in\Sigma_Z$ .

My original post was incorrect (I just edited it). I believe that the expression can be written

$((f \otimes g); h)(a, b, \text{d}z) = f(a, \text{d}x) g(b, \text{d}y) h(x, y, \text{d}z),$

without integral signs, and where $h$ is to the right of $f$ and $g$ . For example, see page 44 of Çinlar, Probability and Stochastics:

[Image: cinlar.png]