Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: event: Categories for AI

Topic: Week 2: Essential building blocks: Categories and Functors

Petar Veličković (Oct 10 2022 at 16:17):

This is the official topic for the course's second lecture, "Essential building blocks: Categories and Functors"

As a reminder, by the end of this week you will:

Understand the key building blocks of categories: objects, morphisms and functors.
Leverage these concepts to explain several standard mathematical constructs: sets, relations, and groups.
Get comfortable manipulating these concepts through several worked exercises.
Ground all of the above in relevant deep learning context, with links to functional programming.
Show how we can build an effective "type checker" for deep learning using the category of sets.

This lecture will help explain key parts of Graph Neural Networks are Dynamic Programmers (Dudzik and Veličković, NeurIPS 2022)

Bruno Gavranović (Oct 11 2022 at 21:59):

Hi, I just wanted to mention that we've updated the calendar invite for this week with the new zoom link (you should be able to find it under 'Location' in the event). Alternatively, you should be able to access it from this link: Zoom link for Monday

Petar Veličković (Oct 11 2022 at 22:57):

Hi all,

Another important point to note: Lecture 2 will start at the same time as Lecture 1 (Monday, 17 October), but unlike Bruno's lecture, it will last for up to two hours (including Q&A). Hence, the event will last 4--6pm UK Time.

This is to give us ample time to play with many interesting categorical constructs in a paced manner. If you cannot make the whole slot, this is okay -- we will again strive to record. Of course, being there for the live event is preferable, as it will allow you to ask questions on the spot. :smile:

Petar Veličković (Oct 17 2022 at 12:56):

Hi all!

Super excited about our lecture today!
As a reminder, the lecture is set for 4--6pm, UK Time.

The zoom link for joining the meeting is: https://uva-live.zoom.us/j/82982934369
As before, we aim to livestream on YouTube, which will be available here: https://youtu.be/jU7KyZn_hBc
(this link should also hold the lecture recording once the event is over!)

As promised, I make the slides available ahead-of-time, so annotation can be done on the fly. Thanks for suggesting this @Charles London!
Our slides for today are given here: https://docs.google.com/presentation/d/1z8QmCWsImykggqrHt6pGoQ2pW1_qfHwdX1BnqqpnCdM/edit?usp=sharing

It's highly recommended to join via Zoom if you can -- there will be plentiful potential for interaction during the lecture.

Join me for an exciting journey into the realm categorical.... :dragon: followed with some immediate implications to deep learning architecture design (we'll apply what we've learnt to explain the key concepts of my paper with @Andrew Dudzik -- freshly accepted at NeurIPS'22.

Lucia (Oct 17 2022 at 15:31):

Where is people answering the questions online? Thanks!

Tali Beynon (Oct 17 2022 at 16:39):

Hi! Would someone be kind enough to CTRL+A, CTRL+C, CTRL+V for the contents of the Q&A? my zoom crashed and i lost all previous comments, which i wanted to preserve for some of the links and references.

Rubén Ballester (Oct 17 2022 at 16:41):

Viliam Vadocz 05:09 PM
Why not use the zoom chat instead of the Q&A?
Pim de Haan 05:10 PM
I think Q&A is preferable. Allows for upvotes
Karthikeyan Natesan Ramamurthy 05:18 PM
If we are not allowed to peek into the objects, how do we define useful morphisms?
This question has been answered live
Filip Bar 05:21 PM
Of course we first have to be aware of what our objects are to define meaningful morphisms (and that is the crux when d efining your category). Once you have that data settled down it is the morphisms that allow you to have the bird's eye view on your global (and also local) structure.
Yivan Zhang 05:24 PM
I think this is rooted in the https://ncatlab.org/nlab/show/Yoneda+lemma
ashwath 05:21 PM
does g circle f need to be unique?
This question has been answered live
jules tsukahara 05:21 PM
If g and f are invertible, is it a category?
This question has been answered live
Filip Bar 05:24 PM
Categories where every arrow is invertible is called a groupoid. As the name suggests it is a generalisation of a group, and has been around before the concept of a category has been defined.
Giacomo Aldegheri 05:22 PM
Does category theory also deal with non-categories, or is the concept general enough to be “exhaustive”?
This question has been answered live
Filip Bar 05:22 PM
Fun fact: you can construct a category over any graph
Tali Beynon 05:23 PM
@Filip aka the free category generated by a quiver
jules tsukahara 05:23 PM
i guess you need to add all self loops and make all arrow composable?
Pim de Haan 05:24 PM
For example, a semigroup lacks an identity and itself can not be seen as a category. However, you can often still define morphisms between two semi-groups, so can form a category of all semigroups.
Matthew Pugh 05:24 PM
is every category a subcategory of the free category on some quiver?
Tali Beynon 05:26 PM
@Matthew yes, with some asterisks. for example you need to define "sub" to mean a quotient operation where you declare some morphisms to be identical to others, rather than a sub-graph.
Tali Beynon 05:26 PM
this is similar (in fact almost identical!) to how any finite group can be seen as isomorphic to a free group quotiented by some relations
Filip Bar 05:27 PM
Yes, it should be. You can forget the category structure and just look at it as a graph. Then you can construct the "free" category over that graph. This doesn't make your original category a subcategory. But you can perform a suitable "quotient construction" as Tali says.
Matthew Pugh 05:28 PM
So not as a subcategory but as a 'Quotient' Category by adding relations? Awesome, thanks for the help
Tali Beynon 05:28 PM
@Jules it's more that the morphisms of the free category represent paths of any finite length in the original graph. you can compose paths when they are head to tail. for example, the morphisms between objects 𝑎 and 𝑏 in the free category on a graph is the set of paths that begin at 𝑎 and end at 𝑏
Tali Beynon 05:29 PM
@Jules then the "empty path", which immediately starts and ends at a vertex 𝑎, is exactly the (unique) identity morphism in the category.
Tali Beynon 05:30 PM
@Jules I don't refer to category theory at all but the way it works is visualized here (https://quivergeometry.net/path-groupoids/) and its the same idea
Franck Albinet 05:27 PM
There is a unique one!
This question has been answered live
Dobrik Georgiev 05:27 PM
Constant fn?
This question has been answered live
Victoria Klein 05:27 PM
constant?
This question has been answered live
Dobrik Georgiev 05:29 PM
What does "up to isomorphism" mean in Cat theory? (i.e. what's an Isomorphism)
João Araújo 05:29 PM
Petar will explain that in a couple of slides
Amina Mollaysa 05:29 PM
isn’t what is in the object makes A different from B? if we ignore what is in the object then everything in a set become the smae? like a molecular graph has different atoms because what is inside atoms actually defines what type of atom it is?
João Araújo 05:34 PM
In our context the sets themselves are the objects, what Petar is saying is that we won’t talk about the elements of the sets, but the sets themselves differ from each other by things like cardinality
Filip Bar 05:36 PM
Good point. This is why we need to carefully distinguish between being isomorphic and being equal. Isomers in Chemistry would be isomorphic (in a suitable sense), but not equal.
Tali Beynon 05:53 PM
@Filip makes me think that a nice illustration of the maxim that "objects are defined by their relationships" for chemistry would be that nuclear isotopes manifest almost no chemical difference (except for slight changes in diffusion and reaction rates due to mass and maybe nuclear spin). so their chemical relationships with other molecules are the same, and we can treat them as the same kind of element "most of them time" because chemistry (=category theory) won't be able to tell the difference between them.
yobibyte 05:30 PM
There is |B| morphisms like that?
This question has been answered live
luchino_prince 05:30 PM
cardinality of B
This question has been answered live
Piotr Piękos 05:30 PM
could you give an example of a non-category that doesn’t satisfy the compositionality axiom?
This question has been answered live
Daniel Gonzalez Cedre 05:31 PM
you can imagine a category with three objects A, B, C and two morphisms A->B and B->C and all of the identities, but no morphism A->C
Sebastian Thomas (dida) 05:32 PM
That’s kind of the „wrong“ question as composition is considered to be part of the data.
Piotr Piękos 05:33 PM
but isn’t it the case that you can always easily extend the universe of morphisms to contain the compositions?
Daniel Gonzalez Cedre 05:34 PM
sure, but then you may be dealing with a different universe. for example, if you are trying to describe some phenomenon using categorical language and you discover that compositionality is not satisfied, you can “add it in yourself,” but you have to acknowledge that the underlying phenomenon was not actually categorical
Daniel Gonzalez Cedre 05:34 PM
perhaps someone else can weigh in with such a real-world example, i can’t think of any at the moment
Sebastian Thomas (dida) 05:35 PM
Yes, you have the right intuition. Given a graph, you can „extend“ this graph and get a category. That construction is called the „free category“ on the given graph.

The true problem is that this free category becomes HUGE.
Piotr Piękos 05:36 PM
regardless of the real world example, i think i see it now more clearly, thanks. (If anyone has a non-trivial example, then I would still appreciate it)
Filip Bar 05:38 PM
@Piotr. Take as objects the points in space (say a torus = doughnuts), and the arrows to be be paths between the points. This does not form a category unless you are very careful how you model continuous paths mathematically.
Brayan Escobar 05:41 PM
e.g. the graph of lovers, the objects are people, and a connection between A and B means "A fells attraction to B" or "A likes B"
Shivam 05:31 PM
Descreibes ‘inside of B’ in a sense?
This question has been answered live
Andrei Manolache 05:32 PM
Don’t you still need to have some information about the set? In this case the cardinality?
Pim de Haan 05:32 PM
Just the set morphisms into and out of the object encode the cardinality.
Andrei Manolache 05:37 PM
I see, but we still need to know something about the (unique) morphisms, isn’t that equivalent to knowing the cardinality of the set?
05:33 PM
05:34 PM
05:34 PM
05:35 PM
05:36 P
Tali Beynon said:

Hi! Would someone be kind enough to CTRL+A, CTRL+C, CTRL+V for the contents of the Q&A? my zoom crashed and i lost all previous comments, which i wanted to preserve for some of the links and references.

M
05:37 PM
05:38 PM
05:41 PM
05:41 PM
05:43 PM
05:52 PM
05:53 PM
05:55 PM
05:55 PM
05:56 PM
05:56 PM
05:58 PM
05:58 PM
05:58 PM
05:59 PM
06:00 PM
06:01 PM
06:01 PM
06:03 PM
06:05 PM
06:09 PM
06:09 PM
06:13 PM
06:14 PM
06:14 PM
06:18 PM
06:18 PM
06:18 PM
06:19 PM
06:22 PM
06:24 PM
06:25 PM
06:25 PM
06:26 PM
06:30 PM
06:32 PM
06:32 PM
06:33 PM
06:33 PM
06:34 PM
06:38 PM

Tali Beynon (Oct 17 2022 at 16:44):

Rubén Ballester said:

Viliam Vadocz 05:09 PM
...

Thanks! Don't know why but everything after 5:35pm got cut off. Weird.

Rubén Ballester (Oct 17 2022 at 16:45):

Tali Beynon said:

Rubén Ballester said:

Viliam Vadocz 05:09 PM
...

Thanks! Don't know why but everything after 5:35pm got cut off. Weird.

I'm sorry, the quality of the transcript I gave is a mess :(

Pim de Haan (Oct 17 2022 at 16:50):

I'll try to upload the Q&A afterwards

Samuel Gélineau (Oct 17 2022 at 16:59):

I'm working on the coexponential exercise. I am guessing that in addition to reversing the arrows in the universal property, I also need to replace product with coproduct? Like this?
X + A <--{u + id_A}-- coexp(B, A) + A <--{v}-- B
X + A <--{e}-- B

Euan Ong (Oct 17 2022 at 17:01):

I notice I’m still slightly confused about the claim that “o is no longer a function” — I intuitively see what’s going on, but why is the claim not “o is no longer an epimorphism”? (A possibly useful question to clarify things: what is the type of o in Haskell?)

Andrew Dudzik (Oct 17 2022 at 17:02):

Samuel Gélineau said:

I'm working on the coexponential exercise. I am guessing that in addition to reversing the arrows in the universal property, I also need to replace product with coproduct? Like this?
X + A <--{u + id_A}-- coexp(B, A) + A <--{v}-- B
X + A <--{e}-- B

This is my interpretation—reverse the arrows and replace product with coproduct. (there are two other possibilities if you only flip one of them, so you can also think of this as three exercises in one)

Samuel Gélineau (Oct 17 2022 at 17:04):

(there are two other possibilities if you only flip one of them, so you can also think of this as three exercises in one)

Hmm, but if we only flip one, then we're no longer looking for the dual concept. Do the other two exercises also lead to interesting, although not dual, concepts in Set?

Andrew Dudzik (Oct 17 2022 at 17:04):

Euan Ong said:

I notice I’m still slightly confused about the claim that “o is no longer a function” — I intuitively see what’s going on, but why is the claim not “o is no longer an epimorphism”? (A possibly useful question to clarify things: what is the type of o in Haskell?)

In that figure, o is surjective; every output position is populated. But some output positions are populated by two sources, so describing such a network requires a relation, also known as a "multi-valued function".

Petar Veličković (Oct 17 2022 at 17:09):

Thank you all so much for coming to the lecture once again, and for having such fantastic talking points, for both asking and answering questions! The community spirit is certainly strong here. :heart:

After two hours of lecturing, I'll need a thorough break. :sweat_smile:
But afterwards I will come back to this thread and attempt to answer any lingering questions that remain.
If you have any leftover feedback, please let me know (either here or by sending me a DM/email).

Pim de Haan (Oct 17 2022 at 17:10):

Seminar-2-QA-Report.pdf Here are the Q&As

Andrew Dudzik (Oct 17 2022 at 17:14):

Samuel Gélineau said:

(there are two other possibilities if you only flip one of them, so you can also think of this as three exercises in one)

Hmm, but if we only flip one, then we're no longer looking for the dual concept. Do the other two exercises also lead to interesting, although not dual, concepts in Set?

If you define exponentiability in a monoidal category, then taking the dual doesn't change the monoidal product.

But given the way things were defined in the lecture, it's appropriate to switch product to sum since it's defined by a universal property that gets reversed when we switch the arrows. And in any case, for a cartesian product if we don't make the switch we're guaranteed to get something boring.

Matthew Pugh (Oct 17 2022 at 17:15):

Is it sensible to think of the co-exponential object as the exponential object in the co-category?

Lucia (Oct 17 2022 at 17:23):

Where was the QnA?

I could only see that zoom chat

Thanks!

Andrew Dudzik (Oct 17 2022 at 17:24):

Lucia said:

Where was the QnA?

I could only see that zoom chat

Thanks!

It's a Zoom feature, there should be a button for it, though their clients do vary across platform.

Mihael Kovač (Oct 17 2022 at 17:37):

Since functors are "maps" between categories, is it correct to say that a functor is just a morphism in the category of categories (assuming such a thing is correctly defined)?

Matthew Pugh (Oct 17 2022 at 17:40):

@Mihael Kovač ncatlab is a resource I've found useful. https://ncatlab.org/nlab/show/Cat It says that functors are morphisms in the category of categories.

Yivan Zhang (Oct 17 2022 at 17:45):

@Aishwarya Balwani About disentanglement: the definition by [Higgins et al. 2018 https://arxiv.org/abs/1812.02230] can actually be interpreted from CT. Since a group is precisely a single-object category, a group action is a functor, and an equivariant map is a natural transformation, their definition (secretly) says that a disentangled representation is a natural transformation where the source category contains direct products of groups. On the other hand, we can also formulate the definition based on statistical independence in the category of probability measures, where $P(X \times Y)$ is all joint distributions and $PX \times PY$ is the product of two marginal distributions (disentangled). We can see that the concept of product is the core of disentanglement, and I think it is promising to unify these two definitions via CT. I'm currently working on this (sorry for the self-promotion: https://arxiv.org/abs/2208.02011), and an initial step is to go beyond groups because the invertibility requirement can be too restrictive in some problems. You can further check the concepts of the monoidal category and monoidal natural transformation, which can be useful here.

Aishwarya Balwani (Oct 17 2022 at 17:57):

@Yivan Zhang I was thinking of precisely some of Irina Higgins' work along these lines when I asked the question! More so perhaps in the context of the relatively newer Disentangling by Subspace Diffusion (which again is something that your answer has alluded to, except in the paper I've mentioned, you'd be talking about product manifolds rather than independent distributions?!). Your paper looks very interesting and I look forward to going through it (and also chatting more about these ideas if you're so inclined)!

Yivan Zhang (Oct 17 2022 at 18:15):

@Aishwarya Balwani Thank you for the reference! I'm actually not aware of this work. It seems they considered more structures (Lie groups), but the core is still the product (product metric here?). One thing that bothered me for a while is that the community seems to divide into two groups, algebra-based and statistics-based, and we still lack a common ground where we can talk about both. I think CT may be the solution.
I'm looking forward to chatting with you more, too!

Tali Beynon (Oct 17 2022 at 18:23):

Matthew Pugh said:

Mihael Kovač ncatlab is a resource I've found useful. https://ncatlab.org/nlab/show/Cat It says that functors are morphisms in the category of categories.

indeed, that's arguably the right the way to remember them. the candidates for morphisms in any category are "structure preserving maps", maps that preserve the structure of the objects in the category. when you apply this maxim to categories themselves, you get that a map between categories must "preserve morphism composition", since morphism composition is what defines the structure of a category. this property of preserving morphism composition, F(f) ; F(g) = F(f ; g) is indeed the (essential) defining property of a functor F. you can compose morphisms and apply functor, or apply functor and compose morphisms, and you get the same thing.

Tali Beynon (Oct 17 2022 at 18:28):

Yivan Zhang said:

Aishwarya Balwani About disentanglement:... We can see that the concept of product is the core of disentanglement, and I think it is promising to unify these two definitions via CT. I'm currently working on this (sorry for the self-promotion: https://arxiv.org/abs/2208.02011), and an initial step is to go beyond groups because the invertibility requirement can be too restrictive in some problems. You can further check the concepts of the monoidal category and monoidal natural transformation, which can be useful here.

that's super interesting. thanks for mentioning your research! how do transcend invertibility? do you move to a semigroupoid?

Yivan Zhang (Oct 17 2022 at 18:46):

Tali Beynon said:

Yivan Zhang said:
that's super interesting. thanks for mentioning your research! how do transcend invertibility? do you move to a semigroupoid?

I mean, we can consider a $\mathbf{B}M$ instead of a $\mathbf{B}G$ , or simply a category with all products, if we want to consider which operations are composable. I think the group structure is not essential in disentanglement and can be limiting. For example, the monoid of natural numbers for size and count is also useful. If we extend the definition in this direction, we can talk about more types of operations (I mentioned some ideas in the future work section). I'm glad that you find it interesting. Thanks!

Samuel Gélineau (Oct 17 2022 at 19:17):

I have a guess for the coexponential in Set exercise, but I don't like it one bit.

my solution

coexp(B, A) = if A is non-empty, then Void, otherwise B

We need to define a morphism v : B -> coexp(B, A) + A.
Case 1: A is non-empty. Then v : B -> Void + A, so we need to map every element in B to an element in A. Since A is non-empty, it has at least one element a ∈ A, to which we map everything. We pick this a arbitrarily.
Case 2: A is empty. Then v : B -> B + A, so we can simply map the elements of B to themselves.

We also need to produce a unique u : coexp(B, A) -> X for any e : B -> X + A. We proceed by cases.
Case 1: A is non-empty. Then u : Void -> X, so we must choose u : Void -> X to be absurd_Void, the unique morphism out of Void.
Case 2: A is empty. Then u : B -> X, and e : B -> X + Void must map every element of B to an X, so we choose that map for our u.

I don't like it because it is far from being as elegant as the exponential, because it doesn't seem very useful, and because of the arbitrary choice in the definition of v. Are we allowed to sweep this arbitrariness under the rug by saying that different choices lead to different definitions of the coexponential which are isomorphic to each other? If it was my choice of the object coexp(B, A) which was arbitrary, I would be fine with the idea that the different choices are isomorphic to each other, in the sense that we can find two morphisms between those two objects which compose to the identity. But what is the appropriate notion of isomorphism between two choices of definitions for v, which is itself a morphism?

Cheuk Kit Lee (Oct 17 2022 at 21:46):

Andrew Dudzik said:

Euan Ong said:

I notice I’m still slightly confused about the claim that “o is no longer a function” — I intuitively see what’s going on, but why is the claim not “o is no longer an epimorphism”? (A possibly useful question to clarify things: what is the type of o in Haskell?)

In that figure, o is surjective; every output position is populated. But some output positions are populated by two sources, so describing such a network requires a relation, also known as a "multi-valued function".

I'm slightly confused here. While I do understand why surjectivity implies o is not function, why is o a surjection to begin with?

Andrew Dudzik (Oct 17 2022 at 23:46):

Cheuk Kit Lee said:

Andrew Dudzik said:

Euan Ong said:

I notice I’m still slightly confused about the claim that “o is no longer a function” — I intuitively see what’s going on, but why is the claim not “o is no longer an epimorphism”? (A possibly useful question to clarify things: what is the type of o in Haskell?)

In that figure, o is surjective; every output position is populated. But some output positions are populated by two sources, so describing such a network requires a relation, also known as a "multi-valued function".

I'm slightly confused here. While I do understand why surjectivity implies o is not function, why is o a surjection to begin with?

In the GNN implementations we're referencing, o would send V^2 to V by summing along one axis, and V^2 to V^2 by the identity. So every element of V^2 is sent by o to two values. The point is that the tensor is copied and then used in two different ways.

Ibraheem Moosa/ইব্রাহীম মূসা (Oct 18 2022 at 00:26):

Hi guys. I am still trying to get my head around products and coproducts. I asked a question in QnA but I didn't understand the answers. If I understand correctly then the projections have to be epimorphism. For the Set category I can see it. But is it true in general? I tried to prove it but didn't get too far. Any explanation or link to further resources will be much appreciated. I also have another question related to this. How do we prove that two morphisms are same, in general? If we can show that for all f_1 we have f_1g=f_1h and for all f2 we have gf_2=hf2 then can we say g=h?

Samuel Gélineau (Oct 18 2022 at 02:29):

Ibraheem Moosa/ইব্রাহীম মূসা said:

How do we prove that two morphisms are same, in general? If we can show that for all f_1 we have f_1∘g=f_1∘h and for all f_2 we have g∘f_2=h∘f_2 then can we say g=h?

Well, yes, but not for a very good reason. It suffices to consider the case f_1 = id, because in that case f_1∘g = f_1∘h implies g = h, because g = id∘g = id∘h = h. So you're proving a lot more than you need to :)

For a specific category, you need to rely on the definition of morphisms for that category, e.g. for Set, it's extensional equality. For an abstract category ℂ, you have to rely on some information you have already been given about the category, e.g. if you know that it has a terminal object 1 and you have two morphisms f : A -> 1 and g : A -> 1, then you know that f = g because the definition of a terminal object says that there is a unique morphism from A to 1.

Alexey Uvarov (Oct 18 2022 at 06:04):

Hi all! I'm trying to understand what's going on in the message passing example. In the slide 99, what is the difference between Arg^{snd} and Arg^{rcv}? What does the second argument of tile() do?

Alberto Meneghello (Oct 18 2022 at 07:48):

Samuel Gélineau ha scritto:

I have a guess for the coexponential in Set exercise, but I don't like it one bit.

my solution

coexp(B, A) = if A is non-empty, then Void, otherwise B

We need to define a morphism v : B -> coexp(B, A) + A.
Case 1: A is non-empty. Then v : B -> Void + A, so we need to map every element in B to an element in A. Since A is non-empty, it has at least one element a ∈ A, to which we map everything. We pick this a arbitrarily.
Case 2: A is empty. Then v : B -> B + A, so we can simply map the elements of B to themselves.

We also need to produce a unique u : coexp(B, A) -> X for any e : B -> X + A. We proceed by cases.
Case 1: A is non-empty. Then u : Void -> X, so we must choose u : Void -> X to be absurd_Void, the unique morphism out of Void.
Case 2: A is empty. Then u : B -> X, and e : B -> X + Void must map every element of B to an X, so we choose that map for our u.

I don't like it because it is far from being as elegant as the exponential, because it doesn't seem very useful, and because of the arbitrary choice in the definition of v....

My result was quite similar, except for leaving undefined ( $\not=$ defined as Void) coexp(B,A) when both A and B are non-empty.

This is my solution for the co-exponential object in Set

Given two sets $X$ and $Y$ , we must find a set $X_Y := coexp(Y,X)$ and a function $\eta : X \longrightarrow X_Y \sqcup Y$ such that for every function $e : X \longrightarrow Z \sqcup Y$ there exists a unique function $k : X_Y \longrightarrow Z$ for which $e = (k\sqcup id_Y) \circ \eta$ .

The situation is depicted in the following figure: coexp.jpg

When $X=\emptyset$ (initial), for every set $Y$ we are forced to consider for $\eta$ the only empty function $\eta : \emptyset \longrightarrow \emptyset_Y \sqcup Y$ .
For every set $Z$ , the empty function $e: \emptyset \longrightarrow Z \sqcup Y$ trivially equals $(k\sqcup id_Y) \circ \eta$ , as $\emptyset = k \circ \emptyset$ for any function $k$ .
In order for such a $k: \emptyset_Y \longrightarrow Z$ to exist even in the case $Z=\emptyset$ , we must define $\emptyset_Y := \emptyset$ (hence unicity is given by $\emptyset$ initial).
When $X\not= \emptyset$ :
Lemma 1.1 $\ \ \ \eta(X) \subseteq \iota_X(X) := \{0\} \times X$
Suppose $\eta(x) = (1,y)$ for some $x \in X, y \in Y$ ; then we can consider $e := \iota_X : X \longrightarrow X \sqcup Y : x \longmapsto (0,x)$ , for which there is no $k: X_Y \longrightarrow Z$ making the diagram to commute. In fact, $(k \sqcup id_Y) \circ \eta (x) = (k \sqcup id_Y) \circ (1,y) = (1,y) \ \not= \ (0,x) = e(x)$ .
Corollary 1.2 $\ \ \ \eta$ must be injective.
As we can take $e = \iota_X$ (which is injective); then also the composite $X \stackrel{\eta}{\longrightarrow} Z \sqcup Y \stackrel{k\sqcup id_Y}{\longrightarrow} X \sqcup Y$ and hence $\eta$ must be injective.
Corollary 1.3 $\ \ \ \eta$ must be surjective on $X_Y$ .
Suppose there is some $(0,w) \in X_Y \sqcup Y \setminus \eta(X)$ ; then the function $k: X_Y \longrightarrow Z$ making the diagram to commute cannot be unique (provided $|Z| \geq 2$ ), as every function $k'$ obtained from $k$ by changing the image of $(0,w)$ will fit.
Lemma 1.4 $\ \ \ Y = \emptyset$
Otherwise we could define a function $e : X \longrightarrow X_Y \sqcup Y : x \longmapsto (1,y)$ for which there would be no way to make the diagram commutative. In fact, for every $k$ it is: $(k \sqcup id_Y) \circ \eta (x) = (k \sqcup id_Y) \circ (0,x') = (0,k(x')) \ \not= \ (1,y) = e(x)$ .

Summing up, the quantifications in the universal property compel the function $\eta$ to be bijective (L1.2, L1.3) and to depend solely on X (L1.1); the latter can be accomplished in two cases: when either $X=\emptyset$ (L0) or $Y=\emptyset$ (L1.4). Hence:

Theorem 2 $\ \ \$ Given two sets X, Y, their co-exponential is defined (up to isomorphism) as:
$(X_Y,\eta) \ := \ \begin{cases} (X, \iota_x) & \text{if} \ X=\emptyset \ \text{or} \ Y=\emptyset \\ \text{undefined} \ \ \ \ \ \ & \text{else} \end{cases}$

Andrew Dudzik (Oct 18 2022 at 08:34):

Ibraheem Moosa/ইব্রাহীম মূসা said:

Hi guys. I am still trying to get my head around products and coproducts. I asked a question in QnA but I didn't understand the answers. If I understand correctly then the projections have to be epimorphism. For the Set category I can see it. But is it true in general? I tried to prove it but didn't get too far. Any explanation or link to further resources will be much appreciated. I also have another question related to this. How do we prove that two morphisms are same, in general? If we can show that for all f_1 we have f_1g=f_1h and for all f2 we have gf_2=hf2 then can we say g=h?

The projections aren't always epimorphisms, even in Set. (take one set to be empty) In general, we can expect this to fail any time a product with a non-initial object equals the initial object. One example where this happens a lot is the category of affine schemes. (defined to be the opposite of the category of commutative rings)

Petar Veličković (Oct 18 2022 at 10:11):

Alexey Uvarov said:

Hi all! I'm trying to understand what's going on in the message passing example. In the slide 99, what is the difference between Arg^{snd} and Arg^{rcv}? What does the second argument of tile() do?

Hi @Alexey Uvarov, thanks for asking! The tile function is something like https://numpy.org/doc/stable/reference/generated/numpy.repeat.html#numpy.repeat.
It repeats (broadcasts) an array along a new axis, a certain number of times. The snd / rcv are arguments coming from the sender / receiver nodes, respectively, by either broadcasting along axis 0 or axis 1.

Viliam Vadocz (Oct 18 2022 at 12:46):

I read somewhere that the lectures are recorded and uploaded to youtube. Has yesterday's lecture been uploaded? Where could I find it?

Pim de Haan (Oct 18 2022 at 13:22):

You can find the link here https://www.youtube.com/watch?v=jU7KyZn_hBc

Cheuk Kit Lee (Oct 18 2022 at 14:00):

Andrew Dudzik said:

Cheuk Kit Lee said:

Andrew Dudzik said:

Euan Ong said:

I notice I’m still slightly confused about the claim that “o is no longer a function” — I intuitively see what’s going on, but why is the claim not “o is no longer an epimorphism”? (A possibly useful question to clarify things: what is the type of o in Haskell?)

In that figure, o is surjective; every output position is populated. But some output positions are populated by two sources, so describing such a network requires a relation, also known as a "multi-valued function".

I'm slightly confused here. While I do understand why surjectivity implies o is not function, why is o a surjection to begin with?

In the GNN implementations we're referencing, o would send V^2 to V by summing along one axis, and V^2 to V^2 by the identity. So every element of V^2 is sent by o to two values. The point is that the tensor is copied and then used in two different ways.

Thanks for the clarification that really helped! It seems like I mistook the indexed union as a cartesian product when I was thinking.

lawrence rowland (Oct 18 2022 at 19:14):

Still struggling way back at the idea of the unit set.
There is exactly one morphism from set A into unit set, which comprises one function mapping each element into the unit set, right ? But if any more than one of those mappings is applied, then isn’t it no longer reaching the unit set, because then the cardinality of the range would be greater than one ?
Forgive the stupid question.

Samuel Gélineau (Oct 18 2022 at 19:37):

lawrence rowland said:

Still struggling way back at the idea of the unit set.
There is exactly one morphism from set A into unit set, which comprises one function mapping each element into the unit set, right ? But if any more than one of those mappings is applied, then isn’t it no longer reaching the unit set, because then the cardinality of the range would be greater than one ?
Forgive the stupid question.

So, let's say you start with a set A with 3 elements A, B and C, and you have a function f which "maps each element into the unit set". Are you saying that after calling f(A), the set A now has 2 elements B and C while the unit set has one element A? And if you then call f(B), the set A now only has one element C while the unit set now has two elements A and B?

If so, please take a deep breath, because you're going to have to relearn everything you know about functions and sets! In a programming language, there are indeed data structures called "sets" to which we can add and remove elements, and there are procedures called "functions" which, when called, can add and remove elements from sets, and perform many more things. However, in math (and in functional programming), sets are immutable, and functions are pure. This means that a set's contents never changes, and that calling a function has no side effects whatsoever, so you don't have to worry about how many times the functions are called and in which order, you only have to understand what the function returns given a particular input.

So when we say that f "maps each elements into the unit set", we are definitely not saying that calling f has the side effect of removing an element from the set A and adding it to the unit set. If the set A has 3 elements, it has always had 3 elements and will always have. The unit set has and always will have one element. Let's call this element "X". Calling f(A) does not perform any side-effect, it just returns X. Calling f(B) also returns X. And so does calling f(C).

Does that help? My apologies if I have completely misunderstood the nature of your confusion.

lawrence rowland (Oct 18 2022 at 21:33):

Hi Samuel,
I think you have opened up a whole new area for me.
Yes that was my misunderstanding, and you have explained it very clearly. Thank you so much for taking the time.
I already think this course is wonderful for opening up a community of interested people to discuss basic misunderstandings with . I was also impressed with how Petar introduced category theory with specific results like this one , only after the basic definition. I had grown used to following other category theory introductions which moved through the same routine examples - and this approach has shaken me out of complacency.

Gal K (Oct 19 2022 at 15:41):

The categories\functors lecture was a great refresher for me.
Amusingly, I found myself more confused by the GNN\message-passing part although ML theory is my main area of research (though I have no experience with neither GNNs or message-passing).
Is this going to be critical for understanding the next lectures? If so, are there any good sources that can help me close the gap?

Thanks!

Petar Veličković (Oct 19 2022 at 15:43):

Gal K said:

The categories\functors lecture was a great refresher for me.
Amusingly, I found myself more confused by the GNN\message-passing part although ML theory is my main area of research (though I have no experience with neither GNNs or message-passing).
Is this going to be critical for understanding the next lectures? If so, are there any good sources that can help me close the gap?

Thanks!

Thanks for raising this @Gal K and for your very kind comments on the lecture!

I don't think it will be critical. Though it will be helpful for @Pim de Haan 's and @Andrew Dudzik's lectures, which will talk about GNNs also.

I recommend checking this lecture from me: https://www.youtube.com/watch?v=uF53xsT7mjc
to get a better feel for GNNs, message passing, and the underlying principles (e.g. permutation equivariance) and perspectives (e.g. graph signal processing, NLP, PGMs...). Hope it helps!