You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
This is the official topic for the third lecture of the course: "Categorical Dataflow: Optics and Lenses as data structures for backpropagation"
As a reminder, by the end of this week you will:
This lecture will help explain key parts of the paper Categorical Foundations of Gradient-Based Learning (Cruttwell et al., ESOP 2022) and the accompanying blog post.
The zoom link for the lecture is https://uva-live.zoom.us/j/83816139841
Hi all,
I'm very excited for our third lecture of the course today! We'll take a look at monoidal categories, cartesian categories, lenses and optics, and see how these constructions help us understand various computation patterns happening inside neural networks.
Here's some last minute information (also communicated in the Google groups email.)
The lecture starts in little over 3 hours (at 4PM UK time) at the following Zoom link: https://uva-live.zoom.us/j/83816139841 As before, it will be streamed to Youtube (https://www.youtube.com/watch?v=p4iRU4pBkCo) and recorded.
The lecture is expected to last up to 2 hours (including Q&A). There's a lot of ground to cover, and this will give us enough time thoroughly unpack these abstract constructions.
See you soon!
We'll be starting in about 15 minutes! The lecture slides are available here, though many slides have content that I plan to fill in during the lecture. After it's filled in I'll update the slides accordingly.
Is there a mistake on the zoom link, I clicked on it and it says it's scheduled for tomorrow?
The zoom link says meeting ID is expired. Is there a new link?
Nitin Kishore Sai Samala said:
The zoom link says meeting ID is expired. Is there a new link?
Works fine for me! Which link are you using?
The link should be https://uva-live.zoom.us/j/83816139841
My Google calendar invite had this link https://uva-live.zoom.us/j/87524053828
Ieva Cepaite said:
Nitin Kishore Sai Samala said:
The zoom link says meeting ID is expired. Is there a new link?
Works fine for me! Which link are you using?
Nitin Kishore Sai Samala said:
My Google calendar invite had this link https://uva-live.zoom.us/j/87524053828
Ieva Cepaite said:
Nitin Kishore Sai Samala said:
The zoom link says meeting ID is expired. Is there a new link?
Works fine for me! Which link are you using?
Ah okay, the one in the email is different, see Pim's reply above:
Pim de Haan said:
The link should be https://uva-live.zoom.us/j/83816139841
Thanks to everyone for listening and asking questions! I'm sorry I didn't manage to answer all the questions, in the beginning my zoom chat was overloading with new information. I'm happy to answer them here.
I think there was the most confusion about the kind graphical transformations that are allowed for monoidal categories. This is certainly a nuanced point, and for now I'll just share the Categorical Quantum Information lecture slides I referred to in the course here, since I need a proper break after a 2h lecture :sweat_smile:
Bruno Gavranovic said:
Thanks to everyone for listening and asking questions! I'm sorry I didn't manage to answer all the questions, in the beginning my zoom chat was overloading with new information. I'm happy to answer them here.
I think there was the most confusion about the kind graphical transformations that are allowed for monoidal categories. This is certainly a nuanced point, and for now I'll just share the lecture slides here, since I need a proper break after a 2h lecture :sweat_smile:
I'll share both the lecture notes and tutorials/solutions for the course (hopefully no one ends up taking Chris Heuenen's categorical quantum information course here :sweat_smile: ). They're focused on quantum systems and qubits and such, but they have a lot of insights for monoidal categories! Categories.pdf TutorialsSolutions.pdf
Ah, I need to mention the context again that the whole exercise about sliding things around on a plane was taken from this course, so it might be helpful to read and compare/contrast
Concerning the question about a homomorphic supply of monoids
Could we please have access to the populated slides?
Matthew Pugh said:
Concerning the question about a homomorphic supply of monoids
I'm not very familiar with "supply" and I'm waiting on the populated slides to look at the definition, but I'm familiar with a different phrasing of this theorem, that every monoidal category has a "universal cartesianization" given by the category of comonoids, and dually a universal cocartesianization given by the category of monoids. These two statements should be equivalent by dualizing.
And yes, cocartesian categories are less intuitive than cartesian ones when you're used to working with sets and types. Generally cartesian categories are more "geometric" and cocartesian ones are more "algebraic"; for example, the opposite of the category of sets with product is cocartesian, and it's equivalent to the category of complete atomic boolean algebras with tensor product.
Tensor products of algebras provide a whole host of examples. A favorite of mine is the category of commutative rings, where the tensor product is the categorical coproduct. We can also look at abelian categories, e.g. the category of abelian groups with direct sum, which are both cartesian and cocartesian.
I'll cover some of this in my lecture in 2 weeks. I think of monoids in terms of data aggregation, so anything like a concatenation, a running total, or an RNN, should be a good place to look to understand cocartesian categories.
Alexandros Keros said:
Could we please have access to the populated slides?
I'll have them up on the next day or two! I want to clean things up a bit
Matthew Pugh said:
Concerning the question about a homomorphic supply of monoids
That's exactly right! Every object becomes a commutative monoid in a unique way, i.e. it becomes equipped with a unique map "merging" two 's into one and a unique initial map . This is what characterises cocartesian categories (category with a monoidal product given by coproduct): we can always systematically "merge" objects, and "create" objects (though I'm not too happy with this terminology)
Note that the category has a supply of monoids on every object given by and (i.e. it's a monoid given by addition with 0 as the netural element), but interestingly this supply is not homomorphic: given an arbitrary smooth function it's not always the case that , and it's not always the case that . This would imply all our functions are additive, when clearly we have things like which violate this.
So depending on what kinds of things we study, we might encounter various supplies that may or may not be homomorphic.
Andrew Dudzik said:
I'm not very familiar with "supply"
The terminology of "supply" arises from Fong & Spivak's Supplying bells & whistles in symmetric monoidal categories paper.
Here are the populated slides from the last lecture.
I added an extra page unpacking the equational form of the exercise diagrams.
Bruno Gavranovic said:
That's exactly right! Every object becomes a comonoid in a unique way, i.e. it becomes equipped with a unique map "merging" two 's into one and a unique initial map . This is what characterises cocartesian categories (category with a monoidal product given by coproduct): we can always systematically "merge" objects, and "create" objects (though I'm not too happy with this terminology)
Would you mind expanding a bit more on your answer? I think your answer treats only one direction of the isomorphism: given a cocartesian monoidal category, we obtain a (homomorphic) supply of monoids. But if we wanted to show the other direction, how should one proceed? As far as I understand (and correct me if I'm wrong), we would make the following assumptions:
Given these ingredients, I guess we need to show that is the coproduct and is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?
One more minor note; you said:
Every object becomes a comonoid in a unique way
Should it have been "monoid" instead "comonoid"?
Dan Oneață said:
Would you mind expanding a bit more on your answer? I think your answer treats only one direction of the isomorphism: given a cocartesian monoidal category, we obtain a (homomorphic) supply of monoids. But if we wanted to show the other direction, how should one proceed? As far as I understand (and correct me if I'm wrong), we would make the following assumptions:
- We are given a symmetric monoidal category .
- Each object is a monoid, that is, we have two morphisms and .
- Every morphism is a monoid homomorhphism, that is, and .
Given these ingredients, I guess we need to show that is the coproduct and is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?
Excellent question! Yeah you can go the other way around, and your proof strategy (showing the universal property hold) sounds good.
I suggest first looking at how the universal property of coproducts gives rise to the monoid structure on any given object. Then reverse-engineering that should be easier.
Dan Oneață said:
One more minor note; you said:
Every object becomes a comonoid in a unique way
Should it have been "monoid" instead "comonoid"?
Yeah I think that was just a typo (it's a common slip of the tongue to say 'comonoid' when you mean 'commutative monoid', I do that all the time unfortunately)
Indeed, that was a typo, thanks!
Dan Oneață said:
Would mind expanding a bit more on your answer?
Happy to elaborate.
Dan Oneață said:
Given these ingredients, I guess we need to show that is the coproduct and is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?
I'll tell you how to do that for the cartesian case (i.e. proving that a symmetric monoidal category with a homomorphic supply of comonoids is cartesian), and you can try to dualise it. Basically the idea is to use the copy map of the comonoid in the universal morphism definition, and the counit map in the projections. But let me take it step by step.
We want to prove that for every pair of objects and the object comes with two projection maps and such that for any other object and maps and there is a unique map of type such that the appropriate diagram (the one drawn in the lecture) commutes.
The proof involves two parts. The first one is to show that the maps and exist. This uses the natural counit map of the comonoid . Then we can form the composite which will be our . Similar strategy follows for .
We now need to show that there is a unique factorization as described above, and this is where we use the copy of the comonoid. The idea is: we have two maps and , i.e. we want to use twice! So to produce the map , we first copy the and then apply individually and on components. I.e. we define it as a composite . We then need to show the diagrams commute, and that the map we defined is unique. For the latter, the kind of proof we usually want is to assume that if there's any other map satisfying the commuting diagram property, it has to be equal to the map we defined.
Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!
Thanks a lot, Bruno and Matteo ! That's very clear! I've also made progress on the dual statement, once Matteo confirmed that I'm on the right path.
Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!
I think that one of the homomorphisms equations is used to show that the product diagram commutes. We have to prove that reduces to . This can be more easily viewed as a string diagram.
Here, step 1 uses the comonoid homomorphism equation , while step 2 uses the comonoid law (?).
Still, I'm still not sure where the other homomorphism law is used. Maybe to show the uniqueness?
One final question: do we also need to prove that the unit object is the terminal object in ?
A somewhat related question; in The Simple Essence of Automatic Differentiation, Conal specifies a Cartesian category using what seems to me a mix of the two approaches discussed above (section 4.3 in the paper):
class Monoidal k => Cartesian k where
exl :: (a × b) `k` a
exr :: (a × b) `k` b
dup :: a `k` (a × a)
So instead of fork :: (a `k` c) × (a `k` d) → (a `k` (c × d))
he is using dup :: a `k` (a × a)
. Is this yet another way of specifying the same structure? Any ideas why he opts for this approach? Is it more natural to use dup
than fork
?
By the way, has anyone here read this paper? It's a really nice paper, but I've been left with some questions, so I would really appreciate if someone is willing to share their thoughts with me.
Dan Oneață said:
Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!
I think that one of the homomorphisms equations is used to show that the product diagram commutes. We have to prove that reduces to . This can be more easily viewed as a string diagram.
Here, step 1 uses the comonoid homomorphism equation , while step 2 uses the comonoid law (?).
Yes, that's right!
Still, I'm still not sure where the other homomorphism law is used. Maybe to show the uniqueness?
Exactly, your intuitions are right. Let's assume there is another universal map making the diagram commute, i.e. for which it holds that and . We can show this is going to be equal to .
This proof is best understood with string diagrams (IMG_0859.jpeg) The first equality is the comonoid unit law (applied twice, once for , once for ). Second equality is the naturality of the comonoid copy map (the second homomorphism). And third equation is assumption that the diagram commutes (written in the paragraph above).
One final question: do we also need to prove that the unit object is the terminal object in ?
Yes, we do. That should follow from the naturality of the counit of the comonoid. It's interesting that the naturality of the counit is used twice (once here, and once in the proof you wrote)! I didn't notice that before.
Dan Oneață said:
A somewhat related question; in The Simple Essence of Automatic Differentiation, Conal specifies a Cartesian category using what seems to me a mix of the two approaches discussed above (section 4.3 in the paper):
class Monoidal k => Cartesian k where exl :: (a × b) `k` a exr :: (a × b) `k` b dup :: a `k` (a × a)
So instead of
fork :: (a `k` c) × (a `k` d) → (a `k` (c × d))
he is usingdup :: a `k` (a × a)
. Is this yet another way of specifying the same structure? Any ideas why he opts for this approach? Is it more natural to usedup
thanfork
?By the way, has anyone here read this paper? It's a really nice paper, but I've been left with some questions, so I would really appreciate if someone is willing to share their thoughts with me.
Yes, this seems to be a different way of specifying the same structure. He seems to be specifying a cartesian category in a sort of a mixed way - he uses dup (the comonoid copy operation) and he also uses projections (which are derived from comonoid counit operation). I suppose the answer of what's more natural might be a matter of taste. To me, thinking with comonoids feels more natural. Using things like dup
makes it explicit that we're copying some data, and using monoid counit makes it explicit that we're forgetting some data.
This becomes especially enlightening when we think about differentiating functions! Reverse derivative (i.e. the transpose of the Jacobian) of copy (comonoid comultiplication) is just sum (monoid multiplication), and reverse derivative of delete (comonoid counit) is just the zero (monoid unit)! I drew a neat picture of this a while ago
E88iVvwWUAATjpo.jpeg
As a matter of fact, it can be shown that a large portion of Conal's paper is secretly talking about operational differences of compositions of various kinds of lenses! He briefly talks about (Cartesian) lenses in Section 3.1 - the kinds of lenses we talked about in this lecture. He correctly identifies that these lenses perform redundant computation, and instead suggests using something called closed (also often confusingly called linear lenses, defined in Categories of Optics, Section 4.8).
I wrote a recent paper arguing that the identified redundant computation is actually part of a tradeoff, i.e. that lens composition actually implements the gradient checkpointing method of implementing automatic differentiation - trading time for space.
Thanks for bringing that paper up! It's a really nice paper, and it motivated a lot of my thinking w.r.t. category theory / automatic differentiation.
Thanks a lot, Bruno, for the detailed answers and food for thought—much appreciated!
Where can I read up on momentum and Nesterov momentum being lenses?
_Categorical foundation of gradient-based learning_ mentions momentum methods @Nathaniel Virgo
Matteo is right. I'll just link Categorical Foundations of Gradient-Based Learning here for easy access.