You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.
What are your goals in learning category theory @joaogui1 (he/him)?
First, because I love math and the abstractions of CT look really interesting. Second, because I'm working on Deep Learning and I feel like a lot of stuff could benefit at least from categorical inspirations (compositionality is pretty much one of the assumptions of Deep Learning). Finally, I feel like compositionality is a powerful idea in general, showing up in things like shell, Lisps, JAX, and Julia
Ok! For deep learning in particular there are (a very small number of) specific papers you should read - with Backprop As Functor at the top of the list if you don't already know it - and (an equally small number of) specific people you should talk to - I'll tag @Bruno Gavranovic
He seems to have some pretty cool papers, nice!
Great question, @Joshua Meyers! There are many reasons for studying category theory, and how one should study it depends on what one is trying to do.
Hey @Bruno Gavranovic what do you recommend for someone interested in ACT for Deep Learning?
I definitely also recommend Backprop as Functor if you haven't already read it
If you love math and are interested in category theory in its own right, you should look at something which is about pure category theory. I really like these notes: https://home.gwu.edu/~wschmitt/papers/cat.pdf After working through these notes, I used Emily Riehl's book, which has a lot of examples from different fields of math --- if you don't know those fields, you won't get all the examples, but you can just skip the examples you don't get. I've also heard other computer science people say they liked Awodey's book, though I have not looked at it much personally. "Algebra: Chapter 0" and "Topology: A Categorical Approach" would be great if you want to learn algebra and topology. Do you want to learn these things?
There are many directions you can go with this, it depends largely on which direction you want to go in.
They are among the things I want to learn. I've always been fascinated by Algebra and tried to take it when I began college, but Brazil's curricula are very strict and I only managed to audit the class. And topology also looks cool and useful for DL
Then yeah those books sound good!
joaogui1 (he/him) said:
Hey Bruno Gavranovic what do you recommend for someone interested in ACT for Deep Learning?
Hi Joao! Unfortunately ACT for Deep Learning is a relatively niche field, and I'm not sure if there are any introductory texts focused on understanding CT with examples from Deep Learning.
If what you're looking for is papers, then you're sort of in luck, because there are a few. I compiled a list a while ago https://github.com/bgavran/Category_Theory_Machine_Learning but most of these are probably pretty abstract if you're just getting into this. Nonetheless, as Jules says - it's good to read those and see what kind of things people are doing, even if it's just getting a taste.
Interestingly enough, there's been a new paper on arxiv today (https://arxiv.org/abs/2103.01189) about learners, and actually we should have our paper on Categorical Foundations of Gradient-Based Learning up on arxiv very very soon :)
Besides backprop as functor, you could do worse than starting with one of Bruno's recorded talks on Youtube
Since this is a recently active research topic, a bunch of it happened during coronatime, which means recorded seminars
Also I suggest getting acquainted with optics (e.g. from Riley's 'Categories of Optics' paper) and teleological string diagrams (I guess from @Jules Hedges first paper on games? idk)
This is probably overkill as a starting point...
If you follow machine learning in the same direction as us (which I think is the "canonical" way of taking seriously the existing compositionality of deep learning - but being wrong is always an option, there could be some totally different way you could go) - you'll eventually meet the question "what the heck does game theory have to do with anything"
yeah, I keep seeing stuff like open games, what the heck does it have to do with anything?
Learning can be seen as a kind of game: that's a pretty basic observation.
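(If a toy example helps: this is entirely my own sketch, not from any of the papers mentioned here, with a made-up payoff function and made-up names. Two players each follow the gradient of a shared payoff, one descending and one ascending, and "training" computes an equilibrium.)

```python
# Toy sketch of "learning as a game": player 1 minimizes f(x, y),
# player 2 maximizes it, and each follows its own gradient.
import jax
import jax.numpy as jnp

def payoff(x, y):
    # Hypothetical payoff: strongly convex in x and strongly concave in y,
    # so simultaneous gradient descent/ascent converges (a purely bilinear
    # x @ y term alone would make the players orbit the equilibrium instead).
    return jnp.sum(x**2) - jnp.sum(y**2) + x @ y

grad_x = jax.grad(payoff, argnums=0)  # player 1's view of the game
grad_y = jax.grad(payoff, argnums=1)  # player 2's view of the game

@jax.jit
def step(x, y, lr=0.05):
    # One round: each player makes a local best-response move.
    return x - lr * grad_x(x, y), y + lr * grad_y(x, y)

x, y = jnp.array([1.0, -1.0]), jnp.array([0.5, 2.0])
for _ in range(500):
    x, y = step(x, y)
print(x, y)  # both end up near the Nash equilibrium at the origin
```

GANs are the textbook instance of this pattern: generator and discriminator are literally two players trained by coupled gradient steps like these.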
That's kind of what I was thinking – why wouldn't game theory be involved in machine learning?
Ok why would it? (I know the answer btw but I don't think it is a basic observation and I am not sure I can answer succinctly)
As an outsider to ML, I don't have a good answer for exactly how it should connect. But there has to be more than a superficial connection between learning goals and player goals, learning strategies and game theoretic strategies, etc. Any kind of learning system/environment should at least be expressible as a formal game, and from there, game theory ought to at least offer some insights. Same should apply to human learning, as well.
By the way, just because something is a basic observation doesn't necessarily mean it's an easy observation to make, especially for the first time. E.g. it's a basic observation that the number zero is important to mathematics.
Jules Hedges said:
This is probably overkill as a starting point...
Well, I said getting acquainted, not full expert. Bruno mentions optics like 5 minutes into his MSP101 talk, I guess. Also many teleological diagrams are drawn, so it's better to understand what's going on.
Maybe the first of your Max Planck lectures would be a great place to get a quick intro to the diagrams
I think the intuitive answer is that game theory studies distributed decision making and that intelligence rarely (if ever) emerges from individual decision making as opposed to distributed decision making
This is all true, but the connection between the categorical approach to machine learning and game theory is more subtle than that. They're structurally the same. I think the true answer is actually closer to "game theory is a kind of machine learning" than the other way round - the central idea of compositional game theory turned out in retrospect to be a sort of "backpropagation of payoffs". In Strathclyde we're working towards a proper grand unification, but none of it's written down yet...
This looks extremely interesting :D
Matteo Capucci (he/him) said:
Jules Hedges said:
This is probably overkill as a starting point...
Well, I said getting acquainted, not full expert. Bruno mentions optics like 5 minutes into his MSP101 talk, I guess. Also many teleological diagrams are drawn, so it's better to understand what's going on.
Is there a link to a recording of that talk?
John Baez said:
Which talk? Right now we're talking about ancient Greek phonetics for some reason.
Sorry, still getting the hang of zulip
Yeah, I do that too: respond to a comment after the conversation has moved on, without noticing...
I found slides for an MSP101 talk by @Bruno Gavranovic on "Categorical Foundation of Gradient-Based Learning":
http://msp.cis.strath.ac.uk/msp101.html
But no video, because it's a talk from "BC": before coronavirus.
A shame, but thanks!
I think the talk you mention does have a video link on that page, John: https://www.youtube.com/watch?v=ji8MHKlQZ9w
You beat me to it. It was a great talk!
Thanks :D
Oh - it said "slides" right over the talk title but those slides were for the previous talk.
I found this talk by cleverly typing "MSP101 bruno" into Google.
This topic was moved here from #learning: questions > beginner questions by Matteo Capucci (he/him)
This topic was moved by Matteo Capucci (he/him) to #general: off-topic > greek shenanigans and misspelled words
John Baez said:
I found this talk by cleverly typing "MSP101 bruno" into Google.
never underestimate the power of googling
There's also his other talk for the CyberCats seminar https://www.youtube.com/watch?v=tKM8JdXJEII
AFAIR there's a lot of overlap between the two, but just in case
Jules Hedges said:
I think the true answer is actually closer to "game theory is a kind of machine learning" than the other way round
The correspondence I've seen already is that there's a functor from learners into games, which initially seems to suggest the converse to that. Does that suggest there's an interesting map in the other direction?
If there is a nice map in both directions (learners to games and games to learners), do we further have an adjunction between the two?
It really depends on how you define stuff. I'm quite eager to make games and learners the same thing, since their categorical semantics is so similar. One key difference is in the computation of equilibria/the learning process, but they can be put on the same footing as well (by swapping a gadget attached to them)
It'd be nice to link to a paper now, but there's not one yet on this :sweat_smile:
Dylan Braithwaite said:
Jules Hedges said:
I think the true answer is actually closer to "game theory is a kind of machine learning" than the other way round
The correspondence I've seen already is that there's a functor from learners into games, which initially seems to suggest the converse to that. Does that suggest there's an interesting map in the other direction?
If you take the definitions exactly as given in Backprop As Functor and Compositional Game Theory then in a completely literal sense you get a functor from learners to games (I wrote the construction in this short tech report: https://arxiv.org/abs/1902.08666), and I think not the other way. But in private we've been fiddling with the definitions of both so they are both subsumed under a single definition, since we had a gut feeling that it was a good idea. Mostly this involves making learners look more like games. In particular, almost all of the differences between the 2 papers arise from the fact that they're not talking about the differentiable structure that's in machine learning, they stick to Set (which in turn buys them some more generality to talk about other kinds of learners that don't use differentiation). Once you put the differentiable structure back into Backprop As Functor it really starts to look the same as open games - not that there's an adjunction or even an equivalence, but that they're literally the same category
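For reference, here are the shapes being matched, in my own paraphrase of the two papers (treat it as a reading aid, not the official definitions). A learner $A \to B$ in Backprop as Functor is a tuple $(P, I, U, r)$ with
$$I : P \times A \to B \qquad \text{(implement, the forward pass)}$$
$$U : P \times A \times B \to P \qquad \text{(update the parameters)}$$
$$r : P \times A \times B \to A \qquad \text{(request, what gets passed backwards)}$$
while an open game $(X, S) \to (Y, R)$ carries a strategy set $\Sigma$, a play function $\Sigma \times X \to Y$, a coplay function $\Sigma \times X \times R \to S$, and a best-response relation. Matching $P$ with $\Sigma$, $I$ with play, and $r$ with coplay is, if I remember the construction right, essentially the functor in that tech report, with the update $U$ getting encoded into the game's best-response data.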
Exactly - I'm still quite fascinated that it's now plausible to say that exactly the same structure might be shared by learners and open games. Now I'm actually thinking it might even be a good idea to create a stream here about category theory & machine learning... there are definitely papers piling up now that we've finally uploaded ours! :) Categorical Foundations of Gradient-Based Learning
@Bruno Gavranovic do you want to mention that paper over on #practice: our papers ?
Ah, I was just looking if there's a place to mention papers. Yes I will!
Please do create that stream on category theory & machine learning! I would really like to teach a course on that, but I need some good arguments to convince the rest of my department...
I’m reading Baez’s argument on Bayesian statistics. It’s interesting how he connects informational evolution and biological evolution.
I'm reading this article "Mathematics for AI: Categories, Toposes, Types": https://www.cambridge.org/core/books/mathematics-for-future-computing-and-communications/mathematics-for-ai-categories-toposes-types/F75B023A3C42FAB5D4F186376216FC76
Here it talks about the topos of a chain of feedforward neural networks (is it just a chain of neurons, or actually a chain of networks?). [screenshot from the article] What does "freely generated" mean? Does "possible activities" mean the number of neurons that are active? Is ... the same as ...? Does the functor send a neuron to the product of the weights that precede the neuron? What's the difference between ..., ..., and ...? What is the functor trying to say? Since ... is undefined, I understand it as multiplying all the preceding weights like $W_a$; is the map removing the weights between ... and ... to get the weights up to ...? Is there a logical reason for defining such a presheaf?
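To fix notation for my own sake (and possibly wrongly): I believe the base category here is just the chain of layers, so a presheaf is a very concrete gadget. Take the layers as a poset category $\mathcal{C}$ with objects $0 \to 1 \to \cdots \to n$ and a unique arrow $a \to b$ whenever $a \le b$. A presheaf $X : \mathcal{C}^{\mathrm{op}} \to \mathbf{Set}$ then assigns a set $X_a$ of possible activities to each layer $a$ and a map $X_b \to X_a$ to each $a \le b$, and functoriality forces the map for $a \le c$ to be the composite through any intermediate layer $b$. So restricting across several layers at once equals restricting one layer at a time, which seems to be what the products of weights in the screenshot are encoding.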
Here it talks about the subobject classifier of this topos, which I know exists but am not sure why (how does one get the subobject classifier out of a topos? Can anyone provide a reference?). What does "localization" mean? Why is it increasingly determined from the initial layer to the final layer? Why is it the "output information that an external observer can deduce from the activity within the inner layers"? [screenshot from the article]
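Partially answering my own parenthetical, possibly with errors: in any presheaf topos the subobject classifier is given by sieves, and Mac Lane and Moerdijk's "Sheaves in Geometry and Logic" is a standard reference for the construction. Concretely, for $\widehat{\mathcal{C}} = [\mathcal{C}^{\mathrm{op}}, \mathbf{Set}]$ one takes
$$\Omega(c) = \{\,\text{sieves on } c\,\},$$
a sieve on $c$ being a set of morphisms with codomain $c$ closed under precomposition. If $\mathcal{C}$ is the chain of layers $0 \to 1 \to \cdots \to n$, a sieve on layer $a$ amounts to a down-closed subset of $\{0, \dots, a\}$, so $\Omega(a)$ has $a + 2$ elements and grows layer by layer. That would at least be consistent with truth values being "increasingly determined" along the network, modulo however the paper orients its arrows.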
Here it's talking about geometric morphisms. Should I understand ... as the power set here (which can be identified with subobject classifiers)? [screenshot from the article]