Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: event: Categories for AI

Topic: Week 3: Categorical Dataflow: Lenses and Optics

Bruno Gavranović (Oct 20 2022 at 09:36):

This is the official topic for the third lecture of the course: "Categorical Dataflow: Optics and Lenses as data structures for backpropagation"

As a reminder, by the end of this week you will:

Understand the difference between a monoidal and a cartesian category
Get comfortable using their formal graphical language: string diagrams
Learn about lenses and optics, abstract interfaces for modelling bidirectional data flow
See examples of lenses and optics modelling backpropagation, gradient descent, value iteration, and more
Understand how the chain rule is a special case of lens composition

This lecture will help explain key parts of the paper Categorical Foundations of Gradient-Based Learning (Cruttwell et al., ESOP 2022) and the accompanying blog post.

Pim de Haan (Oct 20 2022 at 09:44):

The zoom link for the lecture is https://uva-live.zoom.us/j/83816139841

Bruno Gavranović (Oct 24 2022 at 11:49):

Hi all,

I'm very excited for our third lecture of the course today! We'll take a look at monoidal categories, cartesian categories, lenses and optics, and see how these constructions help us understand various computation patterns happening inside neural networks.

Here's some last minute information (also communicated in the Google groups email.)

The lecture starts in little over 3 hours (at 4PM UK time) at the following Zoom link: https://uva-live.zoom.us/j/83816139841 As before, it will be streamed to Youtube (https://www.youtube.com/watch?v=p4iRU4pBkCo) and recorded.

The lecture is expected to last up to 2 hours (including Q&A). There's a lot of ground to cover, and this will give us enough time thoroughly unpack these abstract constructions.

See you soon!

Bruno Gavranović (Oct 24 2022 at 14:43):

We'll be starting in about 15 minutes! The lecture slides are available here, though many slides have content that I plan to fill in during the lecture. After it's filled in I'll update the slides accordingly.

Sichu Lu (Oct 24 2022 at 14:51):

Is there a mistake on the zoom link, I clicked on it and it says it's scheduled for tomorrow?

Nitin Kishore Sai Samala (Oct 24 2022 at 15:02):

The zoom link says meeting ID is expired. Is there a new link?

Ieva Cepaite (Oct 24 2022 at 15:05):

Nitin Kishore Sai Samala said:

The zoom link says meeting ID is expired. Is there a new link?

Works fine for me! Which link are you using?

Pim de Haan (Oct 24 2022 at 15:05):

The link should be https://uva-live.zoom.us/j/83816139841

Nitin Kishore Sai Samala (Oct 24 2022 at 15:08):

My Google calendar invite had this link https://uva-live.zoom.us/j/87524053828

Ieva Cepaite said:

Nitin Kishore Sai Samala said:

The zoom link says meeting ID is expired. Is there a new link?

Works fine for me! Which link are you using?

Ieva Cepaite (Oct 24 2022 at 15:09):

Nitin Kishore Sai Samala said:

My Google calendar invite had this link https://uva-live.zoom.us/j/87524053828

Ieva Cepaite said:

Nitin Kishore Sai Samala said:

The zoom link says meeting ID is expired. Is there a new link?

Works fine for me! Which link are you using?

Ah okay, the one in the email is different, see Pim's reply above:

Pim de Haan said:

The link should be https://uva-live.zoom.us/j/83816139841

Bruno Gavranović (Oct 24 2022 at 17:14):

Thanks to everyone for listening and asking questions! I'm sorry I didn't manage to answer all the questions, in the beginning my zoom chat was overloading with new information. I'm happy to answer them here.

I think there was the most confusion about the kind graphical transformations that are allowed for monoidal categories. This is certainly a nuanced point, and for now I'll just share the Categorical Quantum Information lecture slides I referred to in the course here, since I need a proper break after a 2h lecture :sweat_smile:

Ieva Cepaite (Oct 24 2022 at 17:17):

Bruno Gavranovic said:

Thanks to everyone for listening and asking questions! I'm sorry I didn't manage to answer all the questions, in the beginning my zoom chat was overloading with new information. I'm happy to answer them here.

I think there was the most confusion about the kind graphical transformations that are allowed for monoidal categories. This is certainly a nuanced point, and for now I'll just share the lecture slides here, since I need a proper break after a 2h lecture :sweat_smile:

I'll share both the lecture notes and tutorials/solutions for the course (hopefully no one ends up taking Chris Heuenen's categorical quantum information course here :sweat_smile: ). They're focused on quantum systems and qubits and such, but they have a lot of insights for monoidal categories! Categories.pdf TutorialsSolutions.pdf

Ieva Cepaite (Oct 24 2022 at 17:20):

Ah, I need to mention the context again that the whole exercise about sliding things around on a plane was taken from this course, so it might be helpful to read and compare/contrast

Matthew Pugh (Oct 24 2022 at 20:39):

Concerning the question about a homomorphic supply of monoids

Alexandros Keros (Oct 24 2022 at 23:11):

Could we please have access to the populated slides?

Andrew Dudzik (Oct 25 2022 at 10:08):

Matthew Pugh said:

Concerning the question about a homomorphic supply of monoids

I'm not very familiar with "supply" and I'm waiting on the populated slides to look at the definition, but I'm familiar with a different phrasing of this theorem, that every monoidal category has a "universal cartesianization" given by the category of comonoids, and dually a universal cocartesianization given by the category of monoids. These two statements should be equivalent by dualizing.

And yes, cocartesian categories are less intuitive than cartesian ones when you're used to working with sets and types. Generally cartesian categories are more "geometric" and cocartesian ones are more "algebraic"; for example, the opposite of the category of sets with product is cocartesian, and it's equivalent to the category of complete atomic boolean algebras with tensor product.

Tensor products of algebras provide a whole host of examples. A favorite of mine is the category of commutative rings, where the tensor product is the categorical coproduct. We can also look at abelian categories, e.g. the category of abelian groups with direct sum, which are both cartesian and cocartesian.

I'll cover some of this in my lecture in 2 weeks. I think of monoids in terms of data aggregation, so anything like a concatenation, a running total, or an RNN, should be a good place to look to understand cocartesian categories.

Bruno Gavranović (Oct 25 2022 at 11:57):

Alexandros Keros said:

Could we please have access to the populated slides?

I'll have them up on the next day or two! I want to clean things up a bit

Bruno Gavranović (Oct 25 2022 at 12:05):

Matthew Pugh said:

Concerning the question about a homomorphic supply of monoids

That's exactly right! Every object $A : \mathcal{C}$ becomes a commutative monoid in a unique way, i.e. it becomes equipped with a unique map $\nabla_A : A + A \to A$ "merging" two $A$ 's into one and a unique initial map $0 \to A$ . This is what characterises cocartesian categories (category with a monoidal product given by coproduct): we can always systematically "merge" objects, and "create" objects (though I'm not too happy with this terminology)

Bruno Gavranović (Oct 25 2022 at 12:12):

Note that the category $(\mathbf{Euc}, \times, 1)$ has a supply of monoids on every object given by $+$ and $0$ (i.e. it's a monoid given by addition with 0 as the netural element), but interestingly this supply is not homomorphic: given an arbitrary smooth function $f : \mathbb{R} \to \mathbb{R}$ it's not always the case that $f(x + x) = f(x) + f(x)$ , and it's not always the case that $f(0) = 0$ . This would imply all our functions are additive, when clearly we have things like $f(x)=x^2$ which violate this.

Bruno Gavranović (Oct 25 2022 at 12:13):

So depending on what kinds of things we study, we might encounter various supplies that may or may not be homomorphic.

Bruno Gavranović (Oct 25 2022 at 12:14):

Andrew Dudzik said:

I'm not very familiar with "supply"

The terminology of "supply" arises from Fong & Spivak's Supplying bells & whistles in symmetric monoidal categories paper.

Bruno Gavranović (Oct 26 2022 at 09:26):

Here are the populated slides from the last lecture.
I added an extra page unpacking the equational form of the exercise diagrams.

Dan Oneață (Oct 27 2022 at 10:52):

Bruno Gavranovic said:

That's exactly right! Every object $A : \mathcal{C}$ becomes a comonoid in a unique way, i.e. it becomes equipped with a unique map $\nabla_A : A + A \to A$ "merging" two $A$ 's into one and a unique initial map $0 \to A$ . This is what characterises cocartesian categories (category with a monoidal product given by coproduct): we can always systematically "merge" objects, and "create" objects (though I'm not too happy with this terminology)

Would you mind expanding a bit more on your answer? I think your answer treats only one direction of the isomorphism: given a cocartesian monoidal category, we obtain a (homomorphic) supply of monoids. But if we wanted to show the other direction, how should one proceed? As far as I understand (and correct me if I'm wrong), we would make the following assumptions:

We are given a symmetric monoidal category $\langle \mathcal{C}, \otimes, I \rangle$ .
Each object $A \in \mathcal{C}$ is a monoid, that is, we have two morphisms $\mu_A : A \otimes A \to A$ and $\eta_A : I \to A$ .
Every morphism $f : A \to A'$ is a monoid homomorhphism, that is, $f \circ \mu_A = \mu_{A'} \circ (f \otimes f)$ and $f \circ \eta_A = \eta_{A'}$ .

Given these ingredients, I guess we need to show that $\otimes$ is the coproduct and $I$ is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?

One more minor note; you said:

Every object $A : \mathcal{C}$ becomes a comonoid in a unique way

Should it have been "monoid" instead "comonoid"?

Matteo Capucci (he/him) (Oct 27 2022 at 11:14):

Dan Oneață said:

Would you mind expanding a bit more on your answer? I think your answer treats only one direction of the isomorphism: given a cocartesian monoidal category, we obtain a (homomorphic) supply of monoids. But if we wanted to show the other direction, how should one proceed? As far as I understand (and correct me if I'm wrong), we would make the following assumptions:

We are given a symmetric monoidal category $\langle \mathcal{C}, \otimes, I \rangle$ .

Each object $A \in \mathcal{C}$ is a monoid, that is, we have two morphisms $\mu_A : A \otimes A \to A$ and $\eta_A : I \to A$ .

Every morphism $f : A \to A'$ is a monoid homomorhphism, that is, $f \circ \mu_A = \mu_{A'} \circ (f \otimes f)$ and $f \circ \eta_A = \eta_{A'}$ .

Given these ingredients, I guess we need to show that $\otimes$ is the coproduct and $I$ is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?

Excellent question! Yeah you can go the other way around, and your proof strategy (showing the universal property hold) sounds good.

Matteo Capucci (he/him) (Oct 27 2022 at 11:15):

I suggest first looking at how the universal property of coproducts gives rise to the monoid structure on any given object. Then reverse-engineering that should be easier.

Matteo Capucci (he/him) (Oct 27 2022 at 11:16):

Dan Oneață said:

One more minor note; you said:

Every object $A : \mathcal{C}$ becomes a comonoid in a unique way

Should it have been "monoid" instead "comonoid"?

Yeah I think that was just a typo (it's a common slip of the tongue to say 'comonoid' when you mean 'commutative monoid', I do that all the time unfortunately)

Bruno Gavranović (Oct 27 2022 at 12:58):

Indeed, that was a typo, thanks!

Bruno Gavranović (Oct 27 2022 at 13:14):

Dan Oneață said:

Would mind expanding a bit more on your answer?

Happy to elaborate.

Dan Oneață said:

Given these ingredients, I guess we need to show that $\otimes$ is the coproduct and $I$ is the initial object? What would be a way of doing that? By checking the universal properties of the two objects?

I'll tell you how to do that for the cartesian case (i.e. proving that a symmetric monoidal category with a homomorphic supply of comonoids is cartesian), and you can try to dualise it. Basically the idea is to use the copy map of the comonoid in the universal morphism definition, and the counit map in the projections. But let me take it step by step.

We want to prove that for every pair of objects $A$ and $B$ the object $A \otimes B$ comes with two projection maps $\pi_A : A \otimes B \to A$ and $\pi_B: A \otimes B \to B$ such that for any other object $X$ and maps $f : X \to A$ and $g : X \to B$ there is a unique map of type $X \to A \otimes B$ such that the appropriate diagram (the one drawn in the lecture) commutes.

The proof involves two parts. The first one is to show that the maps $\pi_A$ and $\pi_B$ exist. This uses the natural counit map of the comonoid $T_A : A \to I$ . Then we can form the composite $A \otimes B \xrightarrow{T_A \otimes B} I \otimes A \cong A$ which will be our $\pi_A$ . Similar strategy follows for $\pi_B$ .

We now need to show that there is a unique factorization as described above, and this is where we use the copy of the comonoid. The idea is: we have two maps $f : X \to A$ and $g: X \to B$ , i.e. we want to use $X$ twice! So to produce the map $X \to A \otimes B$ , we first copy the $X$ and then apply individually $f$ and $g$ on components. I.e. we define it as a composite $X \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B$ . We then need to show the diagrams commute, and that the map we defined is unique. For the latter, the kind of proof we usually want is to assume that if there's any other map $X \to A \otimes B$ satisfying the commuting diagram property, it has to be equal to the map we defined.

Bruno Gavranović (Oct 27 2022 at 13:23):

Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!

Dan Oneață (Oct 27 2022 at 19:47):

Thanks a lot, Bruno and Matteo ! That's very clear! I've also made progress on the dual statement, once Matteo confirmed that I'm on the right path.

Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!

I think that one of the homomorphisms equations is used to show that the product diagram commutes. We have to prove that $X \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B \xrightarrow{A \otimes T_B} A \otimes I \cong A$ reduces to $X \xrightarrow{f} A$ . This can be more easily viewed as a string diagram.

PXL_20221027_191534622.jpg

Here, step 1 uses the comonoid homomorphism equation $T_B \circ g = T_X$ , while step 2 uses the comonoid law $\Delta_X \circ (1_X \otimes T_X) \circ \lambda_X = 1_X$ (?).

Still, I'm still not sure where the other homomorphism law is used. Maybe to show the uniqueness?

One final question: do we also need to prove that the unit object $I$ is the terminal object in $\mathcal{C}$ ?

Dan Oneață (Oct 27 2022 at 20:16):

A somewhat related question; in The Simple Essence of Automatic Differentiation, Conal specifies a Cartesian category using what seems to me a mix of the two approaches discussed above (section 4.3 in the paper):

class Monoidal k => Cartesian k where
    exl :: (a × b) `k` a
    exr :: (a × b) `k` b
    dup :: a `k` (a × a)

So instead of fork :: (a `k` c) × (a `k` d) → (a `k` (c × d)) he is using dup :: a `k` (a × a). Is this yet another way of specifying the same structure? Any ideas why he opts for this approach? Is it more natural to use dup than fork?

By the way, has anyone here read this paper? It's a really nice paper, but I've been left with some questions, so I would really appreciate if someone is willing to share their thoughts with me.

Bruno Gavranović (Oct 29 2022 at 08:02):

Dan Oneață said:

Now, it's good to ponder where the fact that the supply of comonoids is homomorphic is used!

I think that one of the homomorphisms equations is used to show that the product diagram commutes. We have to prove that $X \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B \xrightarrow{A \otimes T_B} A \otimes I \cong A$ reduces to $X \xrightarrow{f} A$ . This can be more easily viewed as a string diagram.

PXL_20221027_191534622.jpg

Here, step 1 uses the comonoid homomorphism equation $T_B \circ g = T_X$ , while step 2 uses the comonoid law $\Delta_X \circ (1_X \otimes T_X) \circ \lambda_X = 1_X$ (?).

Yes, that's right!

Still, I'm still not sure where the other homomorphism law is used. Maybe to show the uniqueness?

Exactly, your intuitions are right. Let's assume there is another universal map $u : X \to A \otimes B$ making the diagram commute, i.e. for which it holds that $u ; \pi_A = f$ and $u ; \pi_B = g$ . We can show this $u$ is going to be equal to $X \xrightarrow{\Delta_X} X \otimes X \xrightarrow{f \otimes g} A \otimes B$ .

This proof is best understood with string diagrams (IMG_0859.jpeg) The first equality is the comonoid unit law (applied twice, once for $A$ , once for $B$ ). Second equality is the naturality of the comonoid copy map (the second homomorphism). And third equation is assumption that the diagram commutes (written in the paragraph above).

One final question: do we also need to prove that the unit object $I$ is the terminal object in $\mathcal{C}$ ?

Yes, we do. That should follow from the naturality of the counit of the comonoid. It's interesting that the naturality of the counit is used twice (once here, and once in the proof you wrote)! I didn't notice that before.

Bruno Gavranović (Oct 29 2022 at 08:26):

Dan Oneață said:

A somewhat related question; in The Simple Essence of Automatic Differentiation, Conal specifies a Cartesian category using what seems to me a mix of the two approaches discussed above (section 4.3 in the paper):
class Monoidal k => Cartesian k where
    exl :: (a × b) `k` a
    exr :: (a × b) `k` b
    dup :: a `k` (a × a)
So instead of fork :: (a `k` c) × (a `k` d) → (a `k` (c × d)) he is using dup :: a `k` (a × a). Is this yet another way of specifying the same structure? Any ideas why he opts for this approach? Is it more natural to use dup than fork?

By the way, has anyone here read this paper? It's a really nice paper, but I've been left with some questions, so I would really appreciate if someone is willing to share their thoughts with me.

Yes, this seems to be a different way of specifying the same structure. He seems to be specifying a cartesian category in a sort of a mixed way - he uses dup (the comonoid copy operation) and he also uses projections (which are derived from comonoid counit operation). I suppose the answer of what's more natural might be a matter of taste. To me, thinking with comonoids feels more natural. Using things like dup makes it explicit that we're copying some data, and using monoid counit makes it explicit that we're forgetting some data.

Bruno Gavranović (Oct 29 2022 at 08:28):

This becomes especially enlightening when we think about differentiating functions! Reverse derivative (i.e. the transpose of the Jacobian) of copy (comonoid comultiplication) is just sum (monoid multiplication), and reverse derivative of delete (comonoid counit) is just the zero (monoid unit)! I drew a neat picture of this a while ago
E88iVvwWUAATjpo.jpeg

Bruno Gavranović (Oct 29 2022 at 08:36):

As a matter of fact, it can be shown that a large portion of Conal's paper is secretly talking about operational differences of compositions of various kinds of lenses! He briefly talks about (Cartesian) lenses in Section 3.1 - the kinds of lenses we talked about in this lecture. He correctly identifies that these lenses perform redundant computation, and instead suggests using something called closed (also often confusingly called linear lenses, defined in Categories of Optics, Section 4.8).

Bruno Gavranović (Oct 29 2022 at 08:38):

I wrote a recent paper arguing that the identified redundant computation is actually part of a tradeoff, i.e. that lens composition actually implements the gradient checkpointing method of implementing automatic differentiation - trading time for space.

Bruno Gavranović (Oct 29 2022 at 08:39):

Thanks for bringing that paper up! It's a really nice paper, and it motivated a lot of my thinking w.r.t. category theory / automatic differentiation.

Dan Oneață (Oct 30 2022 at 07:35):

Thanks a lot, Bruno, for the detailed answers and food for thought—much appreciated!

Nathaniel Virgo (Oct 30 2022 at 09:22):

Where can I read up on momentum and Nesterov momentum being lenses?

Matteo Capucci (he/him) (Oct 30 2022 at 10:09):

_Categorical foundation of gradient-based learning_ mentions momentum methods @Nathaniel Virgo

Bruno Gavranović (Oct 30 2022 at 19:00):

Matteo is right. I'll just link Categorical Foundations of Gradient-Based Learning here for easy access.