Category Theory Zulip Server Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: learning: questions

Topic: Tensor product of convex spaces


view this post on Zulip fosco (Oct 29 2022 at 22:38):

Apologies in advance because this question will turn out to be fairly technical.

Lately I have been studying the properties of the distribution monad $D$ defined here quite a lot. I am stuck trying to understand a fairly mysterious claim made in this paper by Bart Jacobs, where at the end of page 7 he claims that "Moreover, $D(A\times B)\cong DA\otimes DB$."

I am having trouble understanding this claim, and there must be a mistake in this line of reasoning, but I can't find where.

Where is the mistake? Is there another monoidal closed structure on the category of convex spaces, different from the cartesian one (which, I believe, is not closed, which puzzles me even more), induced from an oplax structure on $D$ that I cannot find?

Inspecting the failure of the map $D(A\times B)\to DA\times DB$ to be invertible, one sees that it is surjective, but not injective: "splitting" a distribution on $A\times B$ into marginalizations forgets all "correlation" (apologies if I am misusing probability theory lingo!) and as a result we can't reconstruct a distribution on $A\times B$ given only its marginalizations. It then seems that $DA\otimes DB$ receives a "universal bilinear map" $DA\times DB \to DA\otimes DB$ in a similar fashion to the tensor product of vector spaces (and maybe still as a quotient of $[0,1]^{DA\times DB}$ under suitable relations?)... and this is where my speculations became too handwavy, I stopped, and I got lost.
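
To make the loss of correlation concrete with a toy example: on $\{0,1\}\times\{0,1\}$, the perfectly correlated distribution and the independent uniform one are different, yet they have the same marginals, so the map above identifies them:

$$
h = \tfrac{1}{2}\,\delta_{(0,0)} + \tfrac{1}{2}\,\delta_{(1,1)}
\qquad\text{vs.}\qquad
h' = \tfrac{1}{4}\,\bigl(\delta_{(0,0)}+\delta_{(0,1)}+\delta_{(1,0)}+\delta_{(1,1)}\bigr),
$$

both with marginal $\tfrac{1}{2}\,\delta_0 + \tfrac{1}{2}\,\delta_1$ on each factor.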

I really have no idea how the map in question is found, let alone the isomorphism that Jacobs claims to exist.

view this post on Zulip fosco (Oct 29 2022 at 22:39):

(PS: I hereby invoke my categorical probability gurus @Paolo Perrone and @Tobias Fritz :-) )

view this post on Zulip Tobias Fritz (Oct 30 2022 at 06:05):

It looks like you've got all the ingredients to resolve this conundrum, so here's a hint: yes, the two monoidal structures are different! And the analogy with vector spaces is a good way to think about it, since it works just the same for convex spaces (and for algebras of commutative monads on cartesian closed categories quite generally, by results of Kock).

The tensor product has the universal property of turning "biaffine" maps into ordinary affine maps, which are the morphisms of convex spaces. To show $D(A \times B) \cong DA \otimes DB$, it should be possible to prove that the laxator $DA \times DB \to D(A \times B)$ has the universal property of the universal biaffine map $DA \times DB \to DA \otimes DB$, based on the universal properties of the free algebras $DA$ and $DB$.
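
Spelled out (writing $\pi : DA \times DB \to DA \otimes DB$ here just as a name for the universal biaffine map): for every convex space $C$, every biaffine map factors uniquely through $\pi$,

$$
f : DA \times DB \to C \ \text{biaffine}
\quad\Longrightarrow\quad
f = \bar f \circ \pi \ \text{ for a unique affine } \ \bar f : DA \otimes DB \to C .
$$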

Let us know if it makes some more sense now.

view this post on Zulip fosco (Oct 30 2022 at 09:10):

So, let's see if I understand what's going on: in order to account for "affine bilinearity" I thought one has to consider an equivalence relation that identifies

and maybe something else that mimics $r(x\otimes y)=(rx)\otimes y=x\otimes(ry)$...

I don't know what confuses me; probably these relations already hold in $D(X\times Y)$ as a consequence of it being a free algebra? Or maybe it's the fact that $DX\times DY\to D(X\times Y)$ is a "stupid" map sending the pair $(p,q)$ to $(x,y)\mapsto p(x)q(y)$, and I don't see how this can have the right universal property: a biaffine map $\varphi : DX\times DY\to Z$ induces an affine map $\bar\varphi : D(X\times Y)\to Z$, probably sending $h\in D(X\times Y)$ to $\varphi(h_y,h_x)$ where $h_x := \sum_x h(x,-)$ and $h_y := \sum_y h(-,y)$.

view this post on Zulip Tobias Fritz (Oct 30 2022 at 16:14):

Yep, I think you're on the right track with that final sentence! Note that it's enough to establish a bijection between biaffine maps $DX \times DY \to Z$ and arbitrary maps $X \times Y \to Z$ that is natural in $Z$. To simplify this problem further, it's enough to establish a natural bijection between either of these sets of maps and the set of maps $X \times DY \to Z$ that are affine in the second argument. And that should be pretty clear.
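
In symbols, the chain of bijections, all natural in $Z$, would be (with $\mathrm{BiAff}$ and $\mathrm{Aff}$ just shorthand for the respective sets of maps):

$$
\mathrm{BiAff}(DX \times DY,\, Z)
\;\cong\; \{\, X \times DY \to Z \ \text{affine in the second argument} \,\}
\;\cong\; \mathbf{Set}(X \times Y,\, Z)
\;\cong\; \mathrm{Aff}(D(X \times Y),\, Z),
$$

where the first two steps use the freeness of $DX$ and $DY$, and the last one the freeness of $D(X \times Y)$.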

Concerning the things before that, I'm not sure since I don't understand your triples in angle bracket notation. What do those mean?

view this post on Zulip John Baez (Oct 30 2022 at 17:13):

Is there another monoidal closed structure on the category of convex spaces, different from the cartesian one?

One well-known monoidal closed structure on convex spaces, where the internal hom is the convex space of all convex linear maps, is not cartesian but only semicartesian.

view this post on Zulip Tobias Fritz (Oct 30 2022 at 20:09):

Right, and that's the monoidal structure that Fosco has denoted $\otimes$. It's quite intriguing that all the standard properties of the tensor product of vector spaces generalize to algebras of commutative monads, and the case of convex spaces is just another instance of that. (The fact that the monoidal structure is semicartesian is because the free convex space monad $D$ has the additional property of being affine, meaning $D1 \cong 1$. That's clearly not true for the vector space monad!)
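
Concretely, for the affine property: a probability distribution on a one-element set can only put all of its weight on that element, so

$$
D1 \;=\; \{\, p : 1 \to [0,1] \ \mid\ p(\star) = 1 \,\} \;\cong\; 1,
$$

whereas the free vector space on one generator is the ground field itself, which is certainly not the terminal (zero) vector space.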

view this post on Zulip fosco (Oct 30 2022 at 20:18):

I'm starting to see all these things through a unified lens, indeed. I never noticed that one can intuitively motivate the machinery of strong monads using vector spaces as an analogue and a canonical laxator $\tau : TV\times TW\to T(V\times W)$ ;-) so cool.

(OT: since we're here discussing this sub-topic, is there a fancy explanation for the nomenclature "strong monad" as well? What exactly is "strong" in a strong monad?)

view this post on Zulip Ralph Sarkis (Oct 30 2022 at 20:30):

In my headcanon, the terminology comes from the fact that giving a strength to an endofunctor on $\mathcal{V}$ is the same thing as enriching it over $\mathcal{V}$, and both those words have a close meaning in real life. (I don't know if that is the original intention.)

view this post on Zulip fosco (Oct 30 2022 at 20:32):

"Mastering others is a tensorial strength;
being enriched is true power.”
― Lao Tzu, Tao Te Ching

view this post on Zulip fosco (Oct 30 2022 at 20:32):

(couldn't resist the pun)

view this post on Zulip fosco (Oct 30 2022 at 20:33):

nice headcanon anyway :-)

view this post on Zulip Amar Hadzihasanovic (Oct 31 2022 at 08:33):

It seems that in Kock's original paper, a “functor” between $\mathcal{V}$-enriched categories is a functor of the underlying (un-enriched) categories, and a “strong functor” is a $\mathcal{V}$-enriched functor.

view this post on Zulip Paolo Perrone (Oct 31 2022 at 09:43):

There is some relevant material at this nlab page, but feel free to add!

view this post on Zulip Paolo Perrone (Oct 31 2022 at 09:46):

Also here, from the closed point of view.

view this post on Zulip Jules Hedges (Oct 31 2022 at 10:43):

I find the tensor product of convex spaces an absolute nightmare to compute with. The other day I was chatting with @John Baez and said I think $[0,1] \otimes [0,1]$ is infinite dimensional, and John pointed out what I totally missed: since $[0,1] = D(2)$, $[0,1] \otimes [0,1] = D(4)$ is a tetrahedron... but I don't know how you'd figure that out without using the universal property. So for example I have absolutely not a clue what $\mathbb{R} \otimes \mathbb{R}$ looks like, and I still suspect it might be infinite dimensional...
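
Spelling out John's point, via the isomorphism this thread started from, $DA \otimes DB \cong D(A \times B)$:

$$
[0,1] \otimes [0,1] \;=\; D(2) \otimes D(2) \;\cong\; D(2 \times 2) \;=\; D(4),
$$

the 3-simplex, i.e. a tetrahedron.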

view this post on Zulip Spencer Breiner (Oct 31 2022 at 12:42):

I think that if $X$ and $Y$ are $m$- and $n$-dimensional, then $X\otimes Y$ embeds into $\mathbb{R}^{m+n+1}$.

You can imagine it as embedding the two smaller pieces as skew to one another in the higher-dimensional space, and then drawing all line segments that connect the two.

view this post on Zulip Oscar Cunningham (Oct 31 2022 at 12:52):

So $\mathbb{R}\otimes\mathbb{R}$ looks like $\mathbb{R}^2\times(0,1)$ plus two skew lines, one on each face?

view this post on Zulip Spencer Breiner (Oct 31 2022 at 12:54):

I was envisioning it as a tetrahedron with vertices and some edges removed.

view this post on Zulip Spencer Breiner (Oct 31 2022 at 12:55):

The skew lines correspond to the two copies of $\mathbb{R}$?

view this post on Zulip Spencer Breiner (Oct 31 2022 at 12:56):

I think that you've got it right

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:08):

Spencer Breiner said:

I think that if $X$ and $Y$ are $m$- and $n$-dimensional, then $X\otimes Y$ embeds into $\mathbb{R}^{m+n+1}$.

You can imagine it as embedding the two smaller pieces as skew to one another in the higher-dimensional space, and then drawing all line segments that connect the two.

It sounds like you're talking about the join of convex spaces, which is best known in the special case of join of polytopes. That is actually the coproduct of convex spaces, which is a third monoidal structure that is not isomorphic to either the cartesian one or the tensor!

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:09):

You can see this from the fact that your dimension count is (up to the constant) additive, while for the tensor product it should be multiplicative.
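
For instance (writing $\star$ for the join): the join of a point and a segment in general position is a triangle, while their tensor is just a segment again, since the one-point space is the unit of $\otimes$:

$$
\dim(\mathrm{pt} \star [0,1]) \;=\; 0 + 1 + 1 \;=\; 2,
\qquad
\dim(\mathrm{pt} \otimes [0,1]) \;=\; \dim [0,1] \;=\; 1 \;=\; (0+1)(1+1) - 1 .
$$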

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:11):

(At least to leading order.)

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:11):

If $X$ embeds into $\mathbb{R}^m$ and $Y$ embeds into $\mathbb{R}^n$, then their tensor product embeds into $\mathbb{R}^{mn+m+n}$.

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:12):

This should be a consequence of $\mathbb{R}^m \otimes \mathbb{R}^n \cong \mathbb{R}^{mn + m + n}$.

view this post on Zulip fosco (Oct 31 2022 at 13:33):

Tobias Fritz said:

If $X$ embeds into $\mathbb{R}^m$ and $Y$ embeds into $\mathbb{R}^n$, then their tensor product embeds into $\mathbb{R}^{mn+m+n}$.

Ah! A wild Segre embedding appears! There's also a discussion on how the map arises from a lax monoidal structure over the projective space functor!

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:34):

These ugly dimension counts (and many other aspects of convex spaces as well) become much simpler if one works with convex cones rather than convex spaces. Here, by a convex cone I simply mean an algebra of the nonnegative linear combinations monad, which is the same as the distribution monad but with the normalization condition dropped. Then the free object on $n$ generators really is $n$-dimensional (it's just $\mathbb{R}_+^n$) rather than weirdly $(n-1)$-dimensional, and the dimensions really just multiply under tensor and add under coproduct. Moreover, the coproduct is at the same time the product, so convex cones actually form a category with biproducts! They're much closer to vector spaces, which one can also understand by noting that they're the semimodules over the rig of nonnegative reals.

I'm generally not a fan of abandoning one category in favour of another one just because the latter is better behaved, but in this case it seems to be warranted: most things that one wants to do with convex spaces can be done more elegantly with convex cones instead. This is closely related to homogenization tricks.
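
A back-of-the-envelope version of that dimension count, assuming the homogenization trick is compatible with the two tensors: an $m$-dimensional convex space homogenizes to an $(m+1)$-dimensional cone, cone dimensions multiply under tensor, and dehomogenizing (slicing at normalization level 1) drops one dimension again, so

$$
(m+1)(n+1) - 1 \;=\; mn + m + n,
$$

which recovers the $\mathbb{R}^{mn+m+n}$ count from above.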

view this post on Zulip Paolo Perrone (Oct 31 2022 at 13:35):

Indeed, the "Segre embedding" for the distribution monad is used in algebraic statistics to describe the variety of independent joints. For us, that's the (image of) the lax monoidal structure of the monad.

view this post on Zulip Tobias Fritz (Oct 31 2022 at 13:38):

Interesting connection! I think the dimension count is indeed the same as in the Segre embedding, but the reason for the dimension shift is a bit different: in projective space, we lose a dimension because of quotienting. For the distribution monad, we lose a dimension because $DX$ is a subobject of $\mathbb{R}^X_+$ determined by the normalization condition. So it's "sub" rather than "quotient".
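
Side by side, the two dimension drops:

$$
DX \;=\; \Bigl\{\, p \in \mathbb{R}^X_+ \ \Big|\ \textstyle\sum_{x} p(x) = 1 \,\Bigr\}
\qquad\text{vs.}\qquad
\mathbb{P}(V) \;=\; \bigl( V \setminus \{0\} \bigr) \big/ \bigl( v \sim \lambda v,\ \lambda \neq 0 \bigr),
$$

a subobject cut out by the normalization condition on one side (finitely supported $p$, for the finitary $D$), a quotient by rescaling on the other.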

view this post on Zulip fosco (Oct 31 2022 at 13:38):

@Tobias Fritz that's exactly how I started seeing this story, making sense of some constructions in affine geometry, and finding that some of them are universal constructions for convex spaces; there are some interesting adjunctions between semimodules and convex sets, as well as convex sets and semilattices. (I was about to rediscover some of them independently from Jacobs, which was a pleasant find)

view this post on Zulip fosco (Oct 31 2022 at 13:40):

I would have said that the point of the story is that an $n$-dimensional projective space can be covered by $n+1$ affine spaces "somewhat canonically" (not to be taken literally; I just mean that the choice of affine charts is the same in every dimension), and each of these affine spaces has a choice of convex structure.

view this post on Zulip Paolo Perrone (Oct 31 2022 at 13:40):

Tobias Fritz said:

Interesting connection! I think the dimension count is indeed the same as in the Segre embedding, but the reason for the dimension shift is a bit different: in projective space, we lose a dimension because of quotienting. For the distribution monad, we lose a dimension because $DX$ is a subobject of $\mathbb{R}^X_+$ determined by the normalization condition. So it's "sub" rather than "quotient".

This might be a little bit of a sidetrack, but I still have hope to find a nice Markov category that's based on quotients rather than subobjects.

view this post on Zulip fosco (Oct 31 2022 at 13:41):

And now, the product of two coverings made of $n+1$ and $m+1$ pieces forms a covering made of...

view this post on Zulip Spencer Breiner (Oct 31 2022 at 13:46):

Thanks for the correction, Tobias! I should have thought through that a bit more :blush:

view this post on Zulip John Baez (Nov 01 2022 at 17:18):

Jules Hedges said:

I find the tensor product of convex spaces an absolute nightmare to compute with. The other day I was chatting with John Baez and said I think $[0,1] \otimes [0,1]$ is infinite dimensional, and John pointed out what I totally missed: since $[0,1] = D(2)$, $[0,1] \otimes [0,1] = D(4)$ is a tetrahedron... but I don't know how you'd figure that out without using the universal property. So for example I have absolutely not a clue what $\mathbb{R} \otimes \mathbb{R}$ looks like, and I still suspect it might be infinite dimensional...

Let me try to guess what it is.

First, I would guess that $\mathbb{R}$ as a convex space is the colimit of the convex subspaces $[-N,N] \subseteq \mathbb{R}$. I don't know if this is true, but suppose it is! Since tensoring with anything is a left adjoint, $\otimes$ distributes over colimits. So, it seems $\mathbb{R} \otimes \mathbb{R}$ should be the colimit of the tetrahedra $[-N,N] \otimes [-N,N]$.
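
In one line, under that colimit assumption, and using that each $[-N,N]$ is affinely isomorphic to $[0,1] = D(2)$:

$$
\mathbb{R} \otimes \mathbb{R}
\;\cong\; \operatorname*{colim}_N \bigl( [-N,N] \otimes [-N,N] \bigr)
\;\cong\; \operatorname*{colim}_N \bigl( D(2) \otimes D(2) \bigr)
\;\cong\; \operatorname*{colim}_N\, D(4),
$$

a colimit of ever larger tetrahedra along the inclusions.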

view this post on Zulip John Baez (Nov 01 2022 at 17:20):

So we get the colimit of bigger and bigger tetrahedra - for specificity, imagine making a regular tetrahedron bigger and bigger by rescaling it, and take the union of all these tetrahedra.

view this post on Zulip John Baez (Nov 01 2022 at 17:21):

This sounds like $\mathbb{R}^4$ to me. So I think the answer is $\mathbb{R}^4$ with its usual convex structure.

view this post on Zulip John Baez (Nov 01 2022 at 17:23):

Now I'm reading the whole thread, and I see this agrees with Tobias' answer.

view this post on Zulip Tobias Fritz (Nov 01 2022 at 18:10):

That's a great way to see it -- assuming that you mean $\mathbb{R}^3$ rather than $\mathbb{R}^4$ :wink:, since the tetrahedra are 3-dimensional.

view this post on Zulip John Baez (Nov 01 2022 at 18:24):

Yes, I meant $\mathbb{R}^3$. The 4 corners of the tetrahedron took over my soul and made me type $\mathbb{R}^4$.

view this post on Zulip John Baez (Nov 01 2022 at 18:24):

But that's the usual sort of fencepost error that tends to kick in with convex sets!

view this post on Zulip John Baez (Nov 01 2022 at 18:26):

Or simplexes, for that matter: the $n$-simplex having $n+1$ vertices.