Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: categorical probability

Topic: Unpacking Stoch


Nathaniel Virgo (Feb 25 2023 at 11:23):

I'm trying to unpack the definition of Stoch (the Kleisli category of the Giry monad) as a Markov category. My goal is to be able to take any string diagram and be able to express it explicitly in terms of measure-theoretic integrals, in order to prove properties of Stoch. I don't know much measure theory and am trying to learn what I need as I go along.

So far the best resource I've found for unpacking Stoch is Panangaden's 'The Category of Markov Kernels' (1999), which is really helpful but stops short of completely answering my question. The tricky part is in the details of how to reason about the interaction between the monoidal product and composition, as I explain below.

Here is what I know so far:

Objects are measurable spaces, which I'll write as $(X,\Sigma_X)$. I'll write $\sigma_X, \tau_X$ etc. for elements of $\Sigma_X$.

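Then a morphism $f\colon X\to Y$ is a map $f\colon \Sigma_Y\times X\to [0,1]$, where we write $f(\sigma_Y\mid x)$ for $f(\sigma_Y,x)$. This must have the properties that $f(-\mid x)$ is a probability measure on $(Y,\Sigma_Y)$ for each $x\in X$, and that $f(\sigma_Y\mid -)$ is a measurable function $X\to[0,1]$ for each $\sigma_Y\in\Sigma_Y$.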

Composition $X\xrightarrow{f}Y\xrightarrow{g}Z$ is given by

$$(f;g)(\sigma_Z\mid x) = \int_{y\in Y} g(\sigma_Z\mid y) \,d\, f(y\mid x).$$

There is some work needed to see that this function has the needed properties and that composition is associative. Panangaden gives an explicit proof of associativity. Identities are Dirac deltas.
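For finite sets these definitions can be sanity-checked numerically. The following is only a minimal sketch, assuming every space is a finite set with the discrete $\sigma$-algebra, so that a kernel is a table of point masses and the integral becomes a sum. The helper names (`compose`, `identity`, `close`) and the example kernels are made up for illustration and are not taken from Panangaden's paper.

```python
# Finite-state sketch of composition in Stoch.
# Assumption: every space is a finite set with the discrete sigma-algebra,
# so a kernel f: X -> Y is a table with f[x][y] = f({y} | x).

def compose(f, g):
    """Return f;g, where (f;g)[x][z] = sum over y of g[y][z] * f[x][y]."""
    out = {}
    for x, row in f.items():
        out[x] = {}
        for y, p in row.items():
            for z, q in g[y].items():
                out[x][z] = out[x].get(z, 0.0) + p * q
    return out


def identity(points):
    """Dirac delta kernel: the measure at x puts all its mass on x."""
    return {x: {y: 1.0 if x == y else 0.0 for y in points} for x in points}


def close(f, g, tol=1e-12):
    """Compare two kernels entry by entry."""
    return all(abs(f[x][y] - g[x][y]) <= tol for x in f for y in f[x])


# Hypothetical example kernels (each row sums to 1).
f = {"x0": {"y0": 0.25, "y1": 0.75}, "x1": {"y0": 1.0, "y1": 0.0}}
g = {"y0": {"z0": 0.5, "z1": 0.5}, "y1": {"z0": 0.0, "z1": 1.0}}
k = {"z0": {"w0": 0.5, "w1": 0.5}, "z1": {"w0": 0.25, "w1": 0.75}}

fg = compose(f, g)
assoc_ok = close(compose(compose(f, g), k), compose(f, compose(g, k)))
left_unit_ok = close(compose(identity(["x0", "x1"]), f), f)
right_unit_ok = close(compose(f, identity(["y0", "y1"])), f)
print(fg["x0"], assoc_ok, left_unit_ok, right_unit_ok)
```

This checks associativity and the unit laws on one example only; it is not a substitute for the measure-theoretic proof.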

The monoidal product is given on objects by $(X\times Y, \Sigma_X\otimes \Sigma_Y)$, where $\Sigma_X\otimes \Sigma_Y$ is the product $\sigma$-algebra. For morphisms $f\colon A\to X$ and $g\colon B\to Y$ we define

$$(f\otimes g)(\sigma_{X}\times \sigma_{Y}\mid a,b) = f(\sigma_{X}\mid a)g(\sigma_{Y}\mid b).$$

This defines the measure only on elements of $\Sigma_X\otimes \Sigma_Y$ that are of the form $\sigma_{X}\times \sigma_{Y}$, but this uniquely extends to a measure on $\Sigma_X\otimes \Sigma_Y$ by Carathéodory's extension theorem.
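In the finite discrete case the extension step is easy to see, because every subset of $X\times Y$ is a finite disjoint union of singleton rectangles and additivity does the rest. A rough sketch under that assumption follows; `tensor` and `measure` are hypothetical helper names, and the numbers are arbitrary.

```python
# Finite-state sketch of the monoidal product of kernels.
# Assumption: finite sets with discrete sigma-algebras, kernels stored as
# f[a][x] = f({x} | a).

def tensor(f, g):
    """(f tensor g)[(a, b)][(x, y)] = f[a][x] * g[b][y]."""
    return {
        (a, b): {(x, y): p * q for x, p in fa.items() for y, q in gb.items()}
        for a, fa in f.items()
        for b, gb in g.items()
    }


def measure(kernel, source, event):
    """Mass that the measure kernel[source] assigns to a set of points."""
    return sum(p for point, p in kernel[source].items() if point in event)


f = {"a": {0: 0.25, 1: 0.75}}
g = {"b": {0: 0.5, 1: 0.5}}
fg = tensor(f, g)

# A rectangle S x T gets f(S | a) * g(T | b), as in the defining equation.
rectangle = {(0, 0), (0, 1)}  # {0} x {0, 1}
rect_mass = measure(fg, ("a", "b"), rectangle)

# The diagonal is not a rectangle; its mass is fixed by additivity.
diagonal = {(0, 0), (1, 1)}
diag_mass = measure(fg, ("a", "b"), diagonal)

print(rect_mass, diag_mass)
```

The diagonal is the finite analogue of a set on which the product formula says nothing directly, yet the extended measure still assigns it a value.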

So far so good. The tricky part is in figuring out how composition and this monoidal product relate to each other.

For example, suppose I have morphisms $f\colon A\to X$, $g\colon B\to Y$ and $h\colon X\otimes Y\to Z$, and I want to calculate $(f\otimes g) \mathbin{⨾} h$. The morphism $f\otimes g$ is as defined above, but in order to compose it with $h$ I need to be able to integrate against it - I can't see a way to write $(f\otimes g) \mathbin{⨾} h$ down as a single equation because of the step where we extend the partially defined measure to a full one.

Ultimately I want to be able to prove that certain morphisms are equal under certain conditions, and this sort of thing is causing a roadblock to that. So I guess my questions are

1) can the morphism $(f\otimes g) \mathbin{⨾} h$ be written down explicitly in a nice way?

2) is there somewhere where this explicit unpacking of Stoch is done and written down already?

If these questions get answers I might have more questions as well, but let's see how this goes.

John Baez (Feb 25 2023 at 17:24):

It can't hurt to email Prakash: he's a friendly guy and he'd be happy to know someone is interested in this.

Richard Samuelson (Feb 25 2023 at 17:32):

You could write it down as an integral. For all measurable

$$p: Z \to [0, \infty]$$

we have

$$\int ((f \otimes g); h)(a, b, \text{d}z) p(z) = \int f(a, \text{d}x) \int g(b, \text{d}y) \int h(x, y, \text{d}z) \, p(z).$$

Hence,

$$((f \otimes g); h)(a, b, \text{d}z) = f(a, \text{d}x) g(b, \text{d}y) h(x, y, \text{d}z)$$

using the notation in https://link.springer.com/book/10.1007/978-0-387-87859-1

Tobias Fritz (Feb 25 2023 at 17:59):

Sorry if this is a bit brief, but I think that the explicit formula should simply be

$$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, df(x|a) \, dg(y|b).$$

This is basically the same equation as calculating the expectation value of a measurable function with respect to a product measure. BTW I personally prefer writing this as

$$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, f(dx|a) \, g(dy|b),$$

with the $d$'s at the integration variables, which seems more intuitive.
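As a finite sanity check of this formula, one can build $f\otimes g$ explicitly, compose it with $h$, and compare against the double sum. This is only a sketch, assuming finite sets with discrete $\sigma$-algebras; the helper names and the numbers are made up for illustration.

```python
# Finite-state check of ((f tensor g); h)(z | a, b)
#   = sum_x sum_y h(z | x, y) f(x | a) g(y | b).
# Assumption: finite discrete spaces, kernels stored as f[a][x] = f({x} | a).

def compose(f, g):
    """(f;g)[x][z] = sum over y of g[y][z] * f[x][y]."""
    out = {}
    for x, row in f.items():
        out[x] = {}
        for y, p in row.items():
            for z, q in g[y].items():
                out[x][z] = out[x].get(z, 0.0) + p * q
    return out


def tensor(f, g):
    """(f tensor g)[(a, b)][(x, y)] = f[a][x] * g[b][y]."""
    return {
        (a, b): {(x, y): p * q for x, p in fa.items() for y, q in gb.items()}
        for a, fa in f.items()
        for b, gb in g.items()
    }


def double_sum(f, g, h, a, b, z):
    """Right-hand side of the formula, with the integrals replaced by sums."""
    return sum(h[(x, y)][z] * f[a][x] * g[b][y] for x in f[a] for y in g[b])


f = {"a": {0: 0.5, 1: 0.5}}
g = {"b": {0: 0.25, 1: 0.75}}
h = {
    (0, 0): {"z0": 1.0, "z1": 0.0},
    (0, 1): {"z0": 0.5, "z1": 0.5},
    (1, 0): {"z0": 0.25, "z1": 0.75},
    (1, 1): {"z0": 0.0, "z1": 1.0},
}

lhs = compose(tensor(f, g), h)[("a", "b")]
rhs = {z: double_sum(f, g, h, "a", "b", z) for z in ("z0", "z1")}
agree = all(abs(lhs[z] - rhs[z]) < 1e-12 for z in rhs)
print(lhs, rhs, agree)
```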

Nathaniel Virgo (Feb 26 2023 at 01:03):

Tobias Fritz said:

Sorry if this is a bit brief, but I think that the explicit formula should simply be

$$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, df(x|a) \, dg(y|b).$$

This is basically the same equation as calculating the expectation value of a measurable function with respect to a product measure.

Thank you, I'm sure this is right. (And brevity is good.) I was confused because I was actually trying to evaluate $q;(f\otimes g);h$ for some $q\colon 1\to X\times Y$, and I was stuck at trying to integrate $h(S\mid x,y)$ against $q;(f\otimes g)$. The latter is a measure defined on a product space but it isn't a product measure, so I couldn't do this.

But your comment made me realise I can just evaluate it as $q;((f\otimes g);h)$ instead and then there's no problem. Maybe this will always work - I can always just evaluate things from back to front and then I'll always either be integrating against a product measure or against a measure that's fully defined instead of partially defined and then extended. I'll try it on my real problem and see if it works.
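The back-to-front idea can be tried out on finite sets, where both bracketings are computable. The sketch below assumes finite discrete spaces and uses a perfectly correlated (so non-product) joint state; the state is typed over the domains of the two kernels so that the composite is well-formed, and all names and numbers are hypothetical.

```python
# Finite-state check that (q;(f tensor g));h == q;((f tensor g);h)
# when q is a joint state that is not a product measure.
# Assumption: finite discrete spaces, kernels stored as f[a][x] = f({x} | a).

def compose(f, g):
    """(f;g)[x][z] = sum over y of g[y][z] * f[x][y]."""
    out = {}
    for x, row in f.items():
        out[x] = {}
        for y, p in row.items():
            for z, r in g[y].items():
                out[x][z] = out[x].get(z, 0.0) + p * r
    return out


def tensor(f, g):
    """(f tensor g)[(a, b)][(x, y)] = f[a][x] * g[b][y]."""
    return {
        (a, b): {(x, y): p * r for x, p in fa.items() for y, r in gb.items()}
        for a, fa in f.items()
        for b, gb in g.items()
    }


# Joint state on A x B from the one-point space {"*"}: perfectly correlated.
q = {"*": {("a0", "b0"): 0.5, ("a1", "b1"): 0.5}}
f = {"a0": {"x0": 1.0, "x1": 0.0}, "a1": {"x0": 0.25, "x1": 0.75}}
g = {"b0": {"y0": 0.5, "y1": 0.5}, "b1": {"y0": 0.0, "y1": 1.0}}
h = {
    ("x0", "y0"): {"z0": 1.0, "z1": 0.0},
    ("x0", "y1"): {"z0": 0.5, "z1": 0.5},
    ("x1", "y0"): {"z0": 0.25, "z1": 0.75},
    ("x1", "y1"): {"z0": 0.0, "z1": 1.0},
}

# q is not a product: its marginals are uniform, but q(a0, b1) = 0, not 0.25.
is_product = abs(q["*"].get(("a0", "b1"), 0.0) - 0.5 * 0.5) < 1e-12

front_to_back = compose(compose(q, tensor(f, g)), h)["*"]
back_to_front = compose(q, compose(tensor(f, g), h))["*"]
same = all(abs(front_to_back[z] - back_to_front[z]) < 1e-12 for z in front_to_back)
print(is_product, front_to_back, back_to_front, same)
```

In this finite setting both orders are available and agree by associativity; the point of the back-to-front order in the general case is only that each integral is taken against a measure one already knows how to integrate against.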

BTW I personally prefer writing this as

$$((f \otimes g) ; h)(S|a,b) = \int_{x \in X} \int_{y \in Y} h(S|x,y) \, f(dx|a) \, g(dy|b),$$

with the $d$'s at the integration variables, which seems more intuitive.

I write it the way I do because I think of $\int_{x\in X}f(x)\,d\,p(x)$ as a souped-up $\langle p, f\rangle$ (applying a 'dual vector' to a 'vector'), where the $x$ is a dummy variable and the $d$ is really just the comma. But I can see the point of the other notation and will give it a try - it might be less confusing in the end.

Nathaniel Virgo (Feb 26 2023 at 01:56):

Richard Samuelson said:

You could write it down as an integral. For all measurable

$$p: Z \to [0, \infty]$$

we have

$$\int ((f \otimes g); h)(a, b, \text{d}z) p(z) = \int h(x, y, \text{d}z) \int f(a, \text{d}x) \int g(b, \text{d}y) \, p(z).$$

Hence,

$$((f \otimes g); h)(a, b, \text{d}z) = h(x, y, \text{d}z) f(a, \text{d}x) g(b, \text{d}y)$$

using the notation in https://link.springer.com/book/10.1007/978-0-387-87859-1

This makes me start to like the $dx$ notation actually - I like being able to write it implicitly like this. But I think we can only get rid of one of the integral signs, because we still have to integrate over $x$ and $y$. So I'd write this as

$$((f \otimes g); h)(dz\mid a, b) = \int_{y\in Y}\int_{x\in X}h(dz\mid x, y) f(dx\mid a) g(dy\mid b),$$

which is basically what Tobias Fritz wrote only with $dz$ in place of $S\in\Sigma_Z$.

Nathaniel Virgo (Feb 26 2023 at 03:10):

I have one more question, which is as much a notation question as anything else. Suppose I have $f\colon 1\to X\otimes Y$ and $g\colon X\otimes Y\to Z$. What's the right way to write down their composite?

It feels like I should be able to write it as

$$(f;g)(dz\mid \bullet) = \int_{x,y\in X\otimes Y} g(dz\mid x, y) f(dx, dy\mid\bullet),$$

but I'm a bit uneasy about writing $dx$ and $dy$, because the space being integrated over is $X\otimes Y$, not $X$ and $Y$ separately. Is there a formal argument that allows it to be written this way, or should it really be something like

$$(f;g)(dz\mid \bullet) = \int_{x,y\in X\otimes Y} g(dz\mid x, y) f(d(x,y)\mid\bullet)$$

instead?

Tobias Fritz (Feb 26 2023 at 07:55):

Without a formal syntax in place, I'd say that both of those notations make sense since they express what you mean clearly enough. You could also write it as a double integral (by Fubini's theorem), and/or write $dx \times dy$ in the argument of $f$ in order to express the idea that one sums up the measures of tiny rectangles.

Nathaniel Virgo (Feb 26 2023 at 08:32):

I guess the semantics of these things is what I'm trying to grasp towards. Or at least I want to be sure I know what I mean when I write these things down.

Can I really use Fubini's theorem here? I thought I could only use that in the case where I'm integrating over a product measure. If I write

$$\int_{x\in X} \int_{y\in Y} g(S\mid x,y) f(dx, dy \mid \bullet)$$

then it looks to me like I need to evaluate something like

$$\int_{y\in Y} g(S\mid x,y) f(T, dy \mid \bullet)$$

and then integrate that over $X$.

If that's right then what's the name of this kind of partial integral? I'm not getting much luck trying to google things like "integrating over one variable of a joint measure".

The same thing comes up in this situation: suppose I have $q\colon 1\to A\otimes X$ and $f\colon X\to Y$, and I want to calculate $q;(\mathrm{id}_A\otimes f)$. I end up with

$$(q;(\mathrm{id}_A\otimes f))(da,dy\mid \bullet) = \int_{(a', x)\in A\otimes X} f(dy\mid x)\, \mathrm{id}_A(da\mid a') \, q(da',dx\mid \bullet).$$

It seems like I should be able to integrate out $a'$ to get

$$(q;(\mathrm{id}_A\otimes f))(da,dy\mid \bullet) = \int_{x\in X} f(dy\mid x) \, q(da,dx\mid \bullet),$$

but then I'm back to integrating only over $X$ even though $q$ is a measure over $A\otimes X$, and I don't know what it means formally.
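On finite sets, at least, the partial integral has an unambiguous reading: for each fixed point of $A$, sum over $X$ only. The sketch below compares that reading with the full composite of the joint state with the identity on $A$ tensored with $f$. It assumes finite discrete spaces, and the helper names and numbers are hypothetical; it says nothing about how to justify the notation for general measurable spaces.

```python
# Finite-state reading of "integrate over X only" against a joint state on A x X.
# Assumption: finite discrete spaces, kernels stored as f[x][y] = f({y} | x).

def compose(f, g):
    """(f;g)[x][z] = sum over y of g[y][z] * f[x][y]."""
    out = {}
    for x, row in f.items():
        out[x] = {}
        for y, p in row.items():
            for z, r in g[y].items():
                out[x][z] = out[x].get(z, 0.0) + p * r
    return out


def tensor(f, g):
    """(f tensor g)[(a, b)][(x, y)] = f[a][x] * g[b][y]."""
    return {
        (a, b): {(x, y): p * r for x, p in fa.items() for y, r in gb.items()}
        for a, fa in f.items()
        for b, gb in g.items()
    }


def identity(points):
    """Dirac delta kernel on a finite set."""
    return {x: {y: 1.0 if x == y else 0.0 for y in points} for x in points}


# A joint state on A x X that is not a product measure.
q = {"*": {("a0", "x0"): 0.5, ("a1", "x0"): 0.25, ("a1", "x1"): 0.25}}
f = {"x0": {"y0": 0.5, "y1": 0.5}, "x1": {"y0": 0.0, "y1": 1.0}}

# Full composite: q followed by (identity on A) tensor f.
via_tensor = compose(q, tensor(identity(["a0", "a1"]), f))["*"]

# Partial sum: result(a, y) = sum over x of f(y | x) * q(a, x).
partial = {}
for (a, x), mass in q["*"].items():
    for y, p in f[x].items():
        partial[(a, y)] = partial.get((a, y), 0.0) + p * mass

agree = all(abs(via_tensor[k] - partial.get(k, 0.0)) < 1e-12 for k in via_tensor)
print(via_tensor, partial, agree)
```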

Tobias Fritz (Feb 26 2023 at 09:11):

Urgh, sorry, I'm a bit out of it today -- please ignore my reference to Fubini, you're right that it's not that.

Richard Samuelson (Feb 26 2023 at 19:09):

Nathaniel Virgo said:

Richard Samuelson said:

You could write it down as an integral. For all measurable

$$p: Z \to [0, \infty]$$

we have

$$\int ((f \otimes g); h)(a, b, \text{d}z) p(z) = \int h(x, y, \text{d}z) \int f(a, \text{d}x) \int g(b, \text{d}y) \, p(z).$$

Hence,

$$((f \otimes g); h)(a, b, \text{d}z) = h(x, y, \text{d}z) f(a, \text{d}x) g(b, \text{d}y)$$

using the notation in https://link.springer.com/book/10.1007/978-0-387-87859-1

This makes me start to like the $dx$ notation actually - I like being able to write it implicitly like this. But I think we can only get rid of one of the integral signs, because we still have to integrate over $x$ and $y$. So I'd write this as

$$((f \otimes g); h)(dz\mid a, b) = \int_{y\in Y}\int_{x\in X}h(dz\mid x, y) f(dx\mid a) g(dy\mid b),$$

which is basically what Tobias Fritz wrote only with $dz$ in place of $S\in\Sigma_Z$.

My original post was incorrect (I just edited it). I believe that the expression can be written

$$((f \otimes g); h)(a, b, \text{d}z) = f(a, \text{d}x) g(b, \text{d}y) h(x, y, \text{d}z),$$

without integral signs, and where $h$ is to the right of $f$ and $g$. For example, see page 44 of Cinlar, Probability and Stochastics:

[Attached image: cinlar.png]