Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: event: Categories for AI

Topic: Week 3: Categorical Dataflow: Lenses and Optics


view this post on Zulip Bruno Gavranović (Oct 20 2022 at 09:36):

This is the official topic for the third lecture of the course: "Categorical Dataflow: Optics and Lenses as data structures for backpropagation"

As a reminder, by the end of this week you will:

This lecture will help explain key parts of the paper Categorical Foundations of Gradient-Based Learning (Cruttwell et al., ESOP 2022) and the accompanying blog post.

view this post on Zulip Pim de Haan (Oct 20 2022 at 09:44):

The zoom link for the lecture is https://uva-live.zoom.us/j/83816139841

view this post on Zulip Bruno Gavranović (Oct 24 2022 at 11:49):

Hi all,

I'm very excited for our third lecture of the course today! We'll take a look at monoidal categories, cartesian categories, lenses and optics, and see how these constructions help us understand various computation patterns happening inside neural networks.

Here's some last minute information (also communicated in the Google groups email.)

The lecture starts in little over 3 hours (at 4PM UK time) at the following Zoom link: https://uva-live.zoom.us/j/83816139841 As before, it will be streamed to Youtube (https://www.youtube.com/watch?v=p4iRU4pBkCo) and recorded.

The lecture is expected to last up to 2 hours (including Q&A). There's a lot of ground to cover, and this will give us enough time thoroughly unpack these abstract constructions.

See you soon!

view this post on Zulip Bruno Gavranović (Oct 24 2022 at 14:43):

We'll be starting in about 15 minutes! The lecture slides are available here, though many slides have content that I plan to fill in during the lecture. After it's filled in I'll update the slides accordingly.

view this post on Zulip Sichu Lu (Oct 24 2022 at 14:51):

Is there a mistake on the zoom link, I clicked on it and it says it's scheduled for tomorrow?

view this post on Zulip Nitin Kishore Sai Samala (Oct 24 2022 at 15:02):

The zoom link says meeting ID is expired. Is there a new link?

view this post on Zulip Ieva Cepaite (Oct 24 2022 at 15:05):

Nitin Kishore Sai Samala said:

The zoom link says meeting ID is expired. Is there a new link?

Works fine for me! Which link are you using?

view this post on Zulip Pim de Haan (Oct 24 2022 at 15:05):

The link should be https://uva-live.zoom.us/j/83816139841

view this post on Zulip Nitin Kishore Sai Samala (Oct 24 2022 at 15:08):

My Google calendar invite had this link https://uva-live.zoom.us/j/87524053828

Ieva Cepaite said:

Nitin Kishore Sai Samala said:

The zoom link says meeting ID is expired. Is there a new link?

Works fine for me! Which link are you using?

view this post on Zulip Ieva Cepaite (Oct 24 2022 at 15:09):

Nitin Kishore Sai Samala said:

My Google calendar invite had this link https://uva-live.zoom.us/j/87524053828

Ieva Cepaite said:

Nitin Kishore Sai Samala said:

The zoom link says meeting ID is expired. Is there a new link?

Works fine for me! Which link are you using?

Ah okay, the one in the email is different, see Pim's reply above:

Pim de Haan said:

The link should be https://uva-live.zoom.us/j/83816139841

view this post on Zulip Bruno Gavranović (Oct 24 2022 at 17:14):

Thanks to everyone for listening and asking questions! I'm sorry I didn't manage to answer all the questions, in the beginning my zoom chat was overloading with new information. I'm happy to answer them here.

I think there was the most confusion about the kind graphical transformations that are allowed for monoidal categories. This is certainly a nuanced point, and for now I'll just share the Categorical Quantum Information lecture slides I referred to in the course here, since I need a proper break after a 2h lecture :sweat_smile:

view this post on Zulip Ieva Cepaite (Oct 24 2022 at 17:17):

Bruno Gavranovic said:

Thanks to everyone for listening and asking questions! I'm sorry I didn't manage to answer all the questions, in the beginning my zoom chat was overloading with new information. I'm happy to answer them here.

I think there was the most confusion about the kind graphical transformations that are allowed for monoidal categories. This is certainly a nuanced point, and for now I'll just share the lecture slides here, since I need a proper break after a 2h lecture :sweat_smile:

I'll share both the lecture notes and tutorials/solutions for the course (hopefully no one ends up taking Chris Heuenen's categorical quantum information course here :sweat_smile: ). They're focused on quantum systems and qubits and such, but they have a lot of insights for monoidal categories! Categories.pdf TutorialsSolutions.pdf

view this post on Zulip Ieva Cepaite (Oct 24 2022 at 17:20):

Ah, I need to mention the context again that the whole exercise about sliding things around on a plane was taken from this course, so it might be helpful to read and compare/contrast

view this post on Zulip Matthew Pugh (Oct 24 2022 at 20:39):

Concerning the question about a homomorphic supply of monoids

view this post on Zulip Alexandros Keros (Oct 24 2022 at 23:11):

Could we please have access to the populated slides?

view this post on Zulip Andrew Dudzik (Oct 25 2022 at 10:08):

Matthew Pugh said:

Concerning the question about a homomorphic supply of monoids


I'm not very familiar with "supply" and I'm waiting on the populated slides to look at the definition, but I'm familiar with a different phrasing of this theorem, that every monoidal category has a "universal cartesianization" given by the category of comonoids, and dually a universal cocartesianization given by the category of monoids. These two statements should be equivalent by dualizing.

And yes, cocartesian categories are less intuitive than cartesian ones when you're used to working with sets and types. Generally cartesian categories are more "geometric" and cocartesian ones are more "algebraic"; for example, the opposite of the category of sets with product is cocartesian, and it's equivalent to the category of complete atomic boolean algebras with tensor product.

Tensor products of algebras provide a whole host of examples. A favorite of mine is the category of commutative rings, where the tensor product is the categorical coproduct. We can also look at abelian categories, e.g. the category of abelian groups with direct sum, which are both cartesian and cocartesian.

I'll cover some of this in my lecture in 2 weeks. I think of monoids in terms of data aggregation, so anything like a concatenation, a running total, or an RNN, should be a good place to look to understand cocartesian categories.

view this post on Zulip Bruno Gavranović (Oct 25 2022 at 11:57):

Alexandros Keros said:

Could we please have access to the populated slides?

I'll have them up on the next day or two! I want to clean things up a bit

view this post on Zulip Bruno Gavranović (Oct 25 2022 at 12:05):

Matthew Pugh said:

Concerning the question about a homomorphic supply of monoids


That's exactly right! Every object A:CA : \mathcal{C} becomes a commutative monoid in a unique way, i.e. it becomes equipped with a unique map A:A+AA\nabla_A : A + A \to A "merging" two AA's into one and a unique initial map 0A0 \to A. This is what characterises cocartesian categories (category with a monoidal product given by coproduct): we can always systematically "merge" objects, and "create" objects (though I'm not too happy with this terminology)

view this post on Zulip Bruno Gavranović (Oct 25 2022 at 12:12):

Note that the category (Euc,×,1)(\mathbf{Euc}, \times, 1) has a supply of monoids on every object given by ++ and 00 (i.e. it's a monoid given by addition with 0 as the netural element), but interestingly this supply is not homomorphic: given an arbitrary smooth function f:RRf : \mathbb{R} \to \mathbb{R} it's not always the case that f(x+x)=f(x)+f(x)f(x + x) = f(x) + f(x), and it's not always the case that f(0)=0f(0) = 0. This would imply all our functions are additive, when clearly we have things like f(x)=x2f(x)=x^2 which violate this.

view this post on Zulip Bruno Gavranović (Oct 25 2022 at 12:13):

So depending on what kinds of things we study, we might encounter various supplies that may or may not be homomorphic.

view this post on Zulip Bruno Gavranović (Oct 25 2022 at 12:14):

Andrew Dudzik said:

I'm not very familiar with "supply"

The terminology of "supply" arises from Fong & Spivak's Supplying bells & whistles in symmetric monoidal categories paper.

view this post on Zulip Bruno Gavranović (Oct 26 2022 at 09:26):

Here are the populated slides from the last lecture.
I added an extra page unpacking the equational form of the exercise diagrams.

view this post on Zulip Dan Oneață (Oct 27 2022 at 10:52):

Bruno Gavranovic said:

That's exactly right! Every object A:CA : \mathcal{C} becomes a comonoid in a unique way, i.e. it becomes equipped with a unique map A:A+AA\nabla_A : A + A \to A "merging" two AA's into one and a unique initial map 0A0 \to A. This is what characterises cocartesian categories (category with a monoidal product given by coproduct): we can always systematically "merge" objects, and "create" objects (though I'm not too happy with this terminology)

Would you mind expanding a bit more on your answer? I think your answer treats only one direction of the isomorphism: given a cocartesian monoidal category, we obtain a (homomorphic) supply of monoids. But if we wanted to show the other direction, how should one proceed? As far as I understand (and correct me if I'm wrong), we would make the following assumptions:

Given these ingredients, I guess we need to show that \otimes is the coproduct and II is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?

One more minor note; you said:

Every object A:CA : \mathcal{C} becomes a comonoid in a unique way

Should it have been "monoid" instead "comonoid"?

view this post on Zulip Matteo Capucci (he/him) (Oct 27 2022 at 11:14):

Dan Oneață said:

Would you mind expanding a bit more on your answer? I think your answer treats only one direction of the isomorphism: given a cocartesian monoidal category, we obtain a (homomorphic) supply of monoids. But if we wanted to show the other direction, how should one proceed? As far as I understand (and correct me if I'm wrong), we would make the following assumptions:

Given these ingredients, I guess we need to show that \otimes is the coproduct and II is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?

Excellent question! Yeah you can go the other way around, and your proof strategy (showing the universal property hold) sounds good.

view this post on Zulip Matteo Capucci (he/him) (Oct 27 2022 at 11:15):

I suggest first looking at how the universal property of coproducts gives rise to the monoid structure on any given object. Then reverse-engineering that should be easier.

view this post on Zulip Matteo Capucci (he/him) (Oct 27 2022 at 11:16):

Dan Oneață said:

One more minor note; you said:

Every object A:CA : \mathcal{C} becomes a comonoid in a unique way

Should it have been "monoid" instead "comonoid"?

Yeah I think that was just a typo (it's a common slip of the tongue to say 'comonoid' when you mean 'commutative monoid', I do that all the time unfortunately)

view this post on Zulip Bruno Gavranović (Oct 27 2022 at 12:58):

Indeed, that was a typo, thanks!

view this post on Zulip Bruno Gavranović (Oct 27 2022 at 13:14):

Dan Oneață said:

Would mind expanding a bit more on your answer?

Happy to elaborate.

Dan Oneață said:

Given these ingredients, I guess we need to show that \otimes is the coproduct and II is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?

I'll tell you how to do that for the cartesian case (i.e. proving that a symmetric monoidal category with a homomorphic supply of comonoids is cartesian), and you can try to dualise it. Basically the idea is to use the copy map of the comonoid in the universal morphism definition, and the counit map in the projections. But let me take it step by step.

We want to prove that for every pair of objects AA and BB the object ABA \otimes B comes with two projection maps πA:ABA\pi_A : A \otimes B \to A and πB:ABB\pi_B: A \otimes B \to B such that for any other object XX and maps f:XAf : X \to A and g:XBg : X \to B there is a unique map of type XABX \to A \otimes B such that the appropriate diagram (the one drawn in the lecture) commutes.

The proof involves two parts. The first one is to show that the maps πA\pi_A and πB\pi_B exist. This uses the natural counit map of the comonoid TA:AIT_A : A \to I. Then we can form the composite ABTABIAAA \otimes B \xrightarrow{T_A \otimes B} I \otimes A \cong A which will be our πA\pi_A. Similar strategy follows for πB\pi_B.

We now need to show that there is a unique factorization as described above, and this is where we use the copy of the comonoid. The idea is: we have two maps f:XAf : X \to A and g:XBg: X \to B, i.e. we want to use XX twice! So to produce the map XABX \to A \otimes B, we first copy the XX and then apply individually ff and gg on components. I.e. we define it as a composite XΔXXXfgABX \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B. We then need to show the diagrams commute, and that the map we defined is unique. For the latter, the kind of proof we usually want is to assume that if there's any other map XABX \to A \otimes B satisfying the commuting diagram property, it has to be equal to the map we defined.

view this post on Zulip Bruno Gavranović (Oct 27 2022 at 13:23):

Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!

view this post on Zulip Dan Oneață (Oct 27 2022 at 19:47):

Thanks a lot, Bruno and Matteo ! That's very clear! I've also made progress on the dual statement, once Matteo confirmed that I'm on the right path.

Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!

I think that one of the homomorphisms equations is used to show that the product diagram commutes. We have to prove that XΔXXXfgABATBAIAX \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B \xrightarrow{A \otimes T_B} A \otimes I \cong A reduces to XfAX \xrightarrow{f} A. This can be more easily viewed as a string diagram.

PXL_20221027_191534622.jpg

Here, step 1 uses the comonoid homomorphism equation TBg=TXT_B \circ g = T_X, while step 2 uses the comonoid law ΔX(1XTX)λX=1X\Delta_X \circ (1_X \otimes T_X) \circ \lambda_X = 1_X (?).

Still, I'm still not sure where the other homomorphism law is used. Maybe to show the uniqueness?

One final question: do we also need to prove that the unit object II is the terminal object in C\mathcal{C}?

view this post on Zulip Dan Oneață (Oct 27 2022 at 20:16):

A somewhat related question; in The Simple Essence of Automatic Differentiation, Conal specifies a Cartesian category using what seems to me a mix of the two approaches discussed above (section 4.3 in the paper):

class Monoidal k => Cartesian k where
    exl :: (a × b) `k` a
    exr :: (a × b) `k` b
    dup :: a `k` (a × a)

So instead of fork :: (a `k` c) × (a `k` d) → (a `k` (c × d)) he is using dup :: a `k` (a × a). Is this yet another way of specifying the same structure? Any ideas why he opts for this approach? Is it more natural to use dup than fork?

By the way, has anyone here read this paper? It's a really nice paper, but I've been left with some questions, so I would really appreciate if someone is willing to share their thoughts with me.

view this post on Zulip Bruno Gavranović (Oct 29 2022 at 08:02):

Dan Oneață said:

Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!

I think that one of the homomorphisms equations is used to show that the product diagram commutes. We have to prove that XΔXXXfgABATBAIAX \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B \xrightarrow{A \otimes T_B} A \otimes I \cong A reduces to XfAX \xrightarrow{f} A. This can be more easily viewed as a string diagram.

PXL_20221027_191534622.jpg

Here, step 1 uses the comonoid homomorphism equation TBg=TXT_B \circ g = T_X, while step 2 uses the comonoid law ΔX(1XTX)λX=1X\Delta_X \circ (1_X \otimes T_X) \circ \lambda_X = 1_X (?).

Yes, that's right!

Still, I'm still not sure where the other homomorphism law is used. Maybe to show the uniqueness?

Exactly, your intuitions are right. Let's assume there is another universal map u:XABu : X \to A \otimes B making the diagram commute, i.e. for which it holds that u;πA=fu ; \pi_A = f and u;πB=gu ; \pi_B = g. We can show this uu is going to be equal to XΔXXXfgABX \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B.

This proof is best understood with string diagrams (IMG_0859.jpeg) The first equality is the comonoid unit law (applied twice, once for AA, once for BB). Second equality is the naturality of the comonoid copy map (the second homomorphism). And third equation is assumption that the diagram commutes (written in the paragraph above).

One final question: do we also need to prove that the unit object II is the terminal object in C\mathcal{C}?

Yes, we do. That should follow from the naturality of the counit of the comonoid. It's interesting that the naturality of the counit is used twice (once here, and once in the proof you wrote)! I didn't notice that before.

view this post on Zulip Bruno Gavranović (Oct 29 2022 at 08:26):

Dan Oneață said:

A somewhat related question; in The Simple Essence of Automatic Differentiation, Conal specifies a Cartesian category using what seems to me a mix of the two approaches discussed above (section 4.3 in the paper):

class Monoidal k => Cartesian k where
    exl :: (a × b) `k` a
    exr :: (a × b) `k` b
    dup :: a `k` (a × a)

So instead of fork :: (a `k` c) × (a `k` d) → (a `k` (c × d)) he is using dup :: a `k` (a × a). Is this yet another way of specifying the same structure? Any ideas why he opts for this approach? Is it more natural to use dup than fork?

By the way, has anyone here read this paper? It's a really nice paper, but I've been left with some questions, so I would really appreciate if someone is willing to share their thoughts with me.

Yes, this seems to be a different way of specifying the same structure. He seems to be specifying a cartesian category in a sort of a mixed way - he uses dup (the comonoid copy operation) and he also uses projections (which are derived from comonoid counit operation). I suppose the answer of what's more natural might be a matter of taste. To me, thinking with comonoids feels more natural. Using things like dup makes it explicit that we're copying some data, and using monoid counit makes it explicit that we're forgetting some data.

view this post on Zulip Bruno Gavranović (Oct 29 2022 at 08:28):

This becomes especially enlightening when we think about differentiating functions! Reverse derivative (i.e. the transpose of the Jacobian) of copy (comonoid comultiplication) is just sum (monoid multiplication), and reverse derivative of delete (comonoid counit) is just the zero (monoid unit)! I drew a neat picture of this a while ago
E88iVvwWUAATjpo.jpeg

view this post on Zulip Bruno Gavranović (Oct 29 2022 at 08:36):

As a matter of fact, it can be shown that a large portion of Conal's paper is secretly talking about operational differences of compositions of various kinds of lenses! He briefly talks about (Cartesian) lenses in Section 3.1 - the kinds of lenses we talked about in this lecture. He correctly identifies that these lenses perform redundant computation, and instead suggests using something called closed (also often confusingly called linear lenses, defined in Categories of Optics, Section 4.8).

view this post on Zulip Bruno Gavranović (Oct 29 2022 at 08:38):

I wrote a recent paper arguing that the identified redundant computation is actually part of a tradeoff, i.e. that lens composition actually implements the gradient checkpointing method of implementing automatic differentiation - trading time for space.

view this post on Zulip Bruno Gavranović (Oct 29 2022 at 08:39):

Thanks for bringing that paper up! It's a really nice paper, and it motivated a lot of my thinking w.r.t. category theory / automatic differentiation.

view this post on Zulip Dan Oneață (Oct 30 2022 at 07:35):

Thanks a lot, Bruno, for the detailed answers and food for thought—much appreciated!

view this post on Zulip Nathaniel Virgo (Oct 30 2022 at 09:22):

Where can I read up on momentum and Nesterov momentum being lenses?

view this post on Zulip Matteo Capucci (he/him) (Oct 30 2022 at 10:09):

_Categorical foundation of gradient-based learning_ mentions momentum methods @Nathaniel Virgo

view this post on Zulip Bruno Gavranović (Oct 30 2022 at 19:00):

Matteo is right. I'll just link Categorical Foundations of Gradient-Based Learning here for easy access.