Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: learning: questions

Topic: Markov category question


view this post on Zulip Bruno Gavranović (Apr 20 2021 at 10:20):

If we have a monoidal monad $D$, it comes equipped with lax structure morphisms $\mu_{X,Y} : DX \times DY \to D(X \otimes Y)$ which, in the case of probability theory, we interpret as the embedding of a pair of distributions, one on $X$ and one on $Y$, into a joint distribution on $X \otimes Y$.
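For concreteness, here is a minimal sketch of what this lax structure morphism does in the finite case, assuming distributions are represented as Python dicts from outcomes to probabilities (the names `mu`, `coin`, `bit` are illustrative, not from any library):

```python
# Hedged sketch: finite distributions as dicts {outcome: probability}.
# mu embeds a pair of distributions into their independent joint,
# playing the role of the lax structure morphism mu_{X,Y}.

def mu(dx, dy):
    """mu_{X,Y} : DX x DY -> D(X (x) Y), the independent joint."""
    return {(x, y): px * py for x, px in dx.items() for y, py in dy.items()}

coin = {"H": 0.5, "T": 0.5}   # a distribution on X
bit = {0: 0.25, 1: 0.75}      # a distribution on Y
joint = mu(coin, bit)
# joint[("H", 1)] == 0.375, and the entries of joint sum to 1
```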

I'm reading about Markov categories, which seem to give a synthetic view of probability theory, and I'm wondering: can I see the above embedding purely in terms of equations in a Markov category? I can't seem to see how it could be done.

view this post on Zulip Jules Hedges (Apr 20 2021 at 10:23):

In a Markov category you also have "marginalisation" maps $D(X \otimes Y) \to DX \times DY$, and marginalisation is a one-sided inverse to $\mu$: if you start with a pair of distributions, stick them together and then marginalise them apart again, you get back what you started with, but the other way round isn't true. Is that what you need?
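The one-sided inverse can be checked concretely, again assuming finite distributions as dicts from outcomes to probabilities; the helper names `pair` and `marginals` are made up for illustration:

```python
# Hedged sketch: pair plays the role of mu, marginals of the
# marginalisation map. Marginalising a product recovers the factors,
# but pairing the marginals of a correlated joint loses correlations.

def pair(p, q):
    return {(x, y): px * qy for x, px in p.items() for y, qy in q.items()}

def marginals(joint):
    px, py = {}, {}
    for (x, y), w in joint.items():
        px[x] = px.get(x, 0.0) + w
        py[y] = py.get(y, 0.0) + w
    return px, py

p = {"a": 0.25, "b": 0.75}
q = {0: 0.5, 1: 0.5}
assert marginals(pair(p, q)) == (p, q)   # one-sided inverse holds

# the other way round fails: a correlated joint is not recovered
correlated = {("a", 0): 0.5, ("b", 1): 0.5}
assert pair(*marginals(correlated)) != correlated
```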

view this post on Zulip Bruno Gavranović (Apr 20 2021 at 10:26):

Hm, I guess it's confusing since you're in an arbitrary Markov category but you're referring to objects as $D(X)$, i.e. using the monad $D$.

view this post on Zulip Jules Hedges (Apr 20 2021 at 10:34):

Oh yeah. Right. Your $\mu$ vanishes into the monoidal structure of the Markov category; you see it directly, e.g. if you have a state $I \to X$ and a state $I \to Y$, you tensor them together into a state $I \to X \otimes Y$. I think the internal way to say it would be that if you have a pair of morphisms $f : X \to Y$ and $g : X' \to Y'$ then
$$X \otimes X' \overset{f \otimes g}{\longrightarrow} Y \otimes Y' \overset{\pi}{\longrightarrow} Y$$
equals $X \otimes X' \overset{\pi}{\longrightarrow} X \overset{f}{\longrightarrow} Y$, where $\pi$ marginalises out the primed factor.
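That equation can be sanity-checked in a concrete model, with Markov kernels modelled (as an illustrative assumption) as Python functions sending an input to a dict of outcome probabilities; `tensor` and `marg1` are made-up names, not a library API:

```python
# Hedged sketch: a Markov kernel is a function x -> {outcome: prob}.
# tensor(f, g) is the monoidal product of kernels; marg1 discards the
# second output factor. Tensoring f with g and then marginalising out
# g's output gives back f (applied to the first input alone).

def tensor(f, g):
    def fg(pair_in):
        x, x2 = pair_in
        return {(y, y2): py * py2
                for y, py in f(x).items() for y2, py2 in g(x2).items()}
    return fg

def marg1(k):
    def km(pair_in):
        out = {}
        for (y, _y2), w in k(pair_in).items():
            out[y] = out.get(y, 0.0) + w
        return out
    return km

f = lambda x: {x: 0.75, 1 - x: 0.25}        # a noisy-copy kernel on bits
g = lambda x: {"u": 0.5, "v": 0.5}          # an arbitrary second kernel

lhs = marg1(tensor(f, g))((0, 1))           # (f (x) g) then project onto Y
assert lhs == f(0)                          # f applied to the first input
```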

view this post on Zulip Bruno Gavranović (Apr 20 2021 at 10:38):

Right, I guess the confusing bit is that in $\mathrm{Kl}(D)$, if I have two objects $A, B$ I can do two things: either first take their monoidal product $A \otimes B$ and then consider $D(A \otimes B)$, or take their distributions separately and then take the cartesian product $D(A) \times D(B)$. And then there's a lax morphism between them.

But in an arbitrary Markov category, if I have two objects $A, B$ I can just take their tensor product $A \otimes B$. And I guess the other analogue doesn't exist? This might or might not be related to your last message.

view this post on Zulip Nathaniel Virgo (Apr 20 2021 at 12:52):

One thing we can say is that in a Markov category, a distribution on $X$ is a morphism $1 \xrightarrow{p} X$, where $1$ is the monoidal unit. If we have a distribution $p$ on $X$ and $q$ on $Y$, we can form a distribution on $X \otimes Y$ given by $1 \xrightarrow{p \otimes q} X \otimes Y$, which has marginals on $X$ and $Y$ given by $p$ and $q$, with $X$ and $Y$ independent. In string diagrams it's

image.png

Of course this just corresponds to one element of $DX \times DY$ being mapped into $D(X \otimes Y)$. If you want a morphism corresponding to the mapping itself, then I think you have to use the language of the representable Markov categories paper, which lets you talk about things like $DX$ ("the object of distributions over $X$") as an object in the Markov category, if the category is such that these objects exist.

view this post on Zulip Bruno Gavranović (Apr 20 2021 at 18:06):

I see. I understand the case with single distributions.
So you're saying to think of a distribution on $X$ as a map $1 \to X$ in a Markov category. I guess one could get close to something I wanted by thinking of the sets of maps $\mathcal{C}(1, X)$ and $\mathcal{C}(1, Y)$ and relating them to the set $\mathcal{C}(1, X \otimes Y)$.

But somehow that still doesn't feel satisfying. I should probably read the paper you linked, but skimming it for a bit I'm not sure where to start... is there a particular theorem there that relates to what I said?

view this post on Zulip Tobias Fritz (Apr 20 2021 at 19:18):

As @Nathaniel Virgo already mentioned, the formation of product distributions, and more generally products of Markov kernels, corresponds in the Markov category picture simply to the monoidal product. You can think of this quite explicitly in terms of the relevant string diagram: putting two morphisms side by side makes them intuitively "independent" because the diagram is disconnected, and this disconnectedness is exactly what formalizes stochastic independence. (Compare with tensor products in $\mathsf{Vect}$, where you'd also multiply the entries of two matrices in pairs in order to get their tensor product, and note that this corresponds to the multiplication of probabilities in the definition of stochastic independence.)
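The $\mathsf{Vect}$ analogy can be sketched numerically (assuming NumPy; the variable names are illustrative): the outer product of two probability vectors is their tensor product, and its entries are exactly the products of probabilities that define stochastic independence.

```python
import numpy as np

# Hedged sketch: the tensor (outer) product of two probability vectors
# multiplies entries in pairs, giving the joint distribution of two
# independent random variables; summing over an axis recovers a marginal.
p = np.array([0.25, 0.75])   # distribution on X
q = np.array([0.5, 0.5])     # distribution on Y
joint = np.outer(p, q)       # joint[i, j] = p[i] * q[j]

assert np.allclose(joint.sum(axis=1), p)  # marginal on X
assert np.allclose(joint.sum(axis=0), q)  # marginal on Y
```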

There's more on the monad side of this stuff in Bimonoidal Structure of Probability Monads, while our more recent paper on representable Markov categories doesn't go into much detail on these aspects.

Is this going in the direction you had in mind?

view this post on Zulip Nathaniel Virgo (Apr 21 2021 at 00:08):

Bruno Gavranovic said:

I should probably read the paper you linked, but skimming it for a bit I'm not sure where to start... is there a particular theorem there that relates to what I said?

I meant that since a representable Markov category has an object $PX$ of probability distributions over $X$ for every $X$, there should be a morphism of type $PX \otimes PY \to P(X \otimes Y)$ that embeds $PX$ and $PY$ into $P(X \otimes Y)$ in the same way as the lax monoidal structure of a probability monad. This morphism is constructed in the proof of Proposition 3.12 in the paper.

So you're saying to think of a distribution on $X$ as a map $1 \to X$ in a Markov category. I guess one could get close to something I wanted by thinking of the sets of maps $\mathcal{C}(1, X)$ and $\mathcal{C}(1, Y)$ and relating them to the set $\mathcal{C}(1, X \otimes Y)$.

I was thinking about this last night. It generalises slightly if we consider a map $\theta \to X$ as a family of distributions parametrised by $\theta$. Then there's a map from the sets $\mathcal{C}(\theta_X, X)$ and $\mathcal{C}(\theta_Y, Y)$ to the set $\mathcal{C}(\theta_X \otimes \theta_Y, X \otimes Y)$ of parametrised families of independent distributions. This map is actually just the action of the functor $\otimes \colon \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ on morphisms. (I guess that's just what Tobias Fritz said above.)

view this post on Zulip Bruno Gavranović (May 01 2021 at 13:59):

Hi, just wanted to say, thanks for all the answers!
I've been thinking about this, but I'm not sure whether it answers my question; there's some confusion in my mind and I'm not even sure what the confusion is.
But somehow I've started doing other stuff now, so I might come back to this question sometime when I feel more ready :grinning: