You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.
What are your goals in learning category theory @joaogui1 (he/him)?
First, because I love math and the abstractions of CT look really interesting. Second, because I'm working on Deep Learning and I feel like a lot of stuff could benefit at least from categorical inspirations (compositionality is pretty much one of the assumptions of Deep Learning). Finally, I feel like compositionality is a powerful idea in general, showing up in things like shell, Lisps, JAX, and Julia
Ok! For deep learning in particular there are (a very small number of) specific papers you should read - with Backprop As Functor at the top of the list if you don't already know it - and (an equally small number of) specific people you should talk to - I'll tag @Bruno Gavranovic
He seems to have some pretty cool papers, nice!
Great question, @Joshua Meyers! There are many reasons for studying category theory, and how one should study it depends on what one is trying to do.
Hey @Bruno Gavranovic what do you recommend for someone interested in ACT for Deep Learning?
I definitely also recommend Backprop as Functor if you haven't already read it
If you love math and are interested in category theory in its own right, you should look at something which is about pure category theory. I really like these notes: https://home.gwu.edu/~wschmitt/papers/cat.pdf After working through these notes, I used Emily Riehl's book, which has a lot of examples from different fields of math --- if you don't know those fields, you won't get all the examples, but you can just skip the examples you don't get. I've also heard other computer science people say they liked Awodey's book, though I have not looked at it much personally. "Algebra: Chapter 0" and "Topology: A Categorical Approach" would be great if you want to learn algebra and topology. Do you want to learn these things?
There are many directions you can go with this, it depends largely on which direction you want to go in.
They are among the things I want to learn. I've always been fascinated by Algebra and tried to take it when I began college, but Brazil's curricula are very strict and I only managed to audit the class. And topology also looks cool and useful for DL
Then yeah those books sound good!
joaogui1 (he/him) said:
Hey Bruno Gavranovic what do you recommend for someone interested in ACT for Deep Learning?
Hi Joao! Unfortunately ACT for Deep Learning is a relatively niche field, and I'm not sure if there are any introductory texts focused on understanding CT with examples from Deep Learning.
If what you're looking for is papers, then you're sort of in luck, because there are a few. I compiled a list a while ago https://github.com/bgavran/Category_Theory_Machine_Learning but most of these are probably pretty abstract if you're just getting into this. Nonetheless, as Jules says - it's good to read those and see what kind of things people are doing, even if it's just getting a taste.
Interestingly enough, there's been a new paper on arxiv today (https://arxiv.org/abs/2103.01189) about learners, and actually we should have our paper on Categorical Foundations of Gradient-Based Learning up on arxiv very very soon :)
Besides backprop as functor, you could do worse than starting with one of Bruno's recorded talks on Youtube
Since this is a recently active research topic, a bunch of it happened during coronatime, which means recorded seminars
Also I suggest getting acquainted with optics (e.g. from Riley's 'Categories of Optics' paper) and teleological string diagrams (I guess from @Jules Hedges first paper on games? idk)
This is probably overkill as a starting point...
If you follow machine learning in the same direction as us (which I think is the "canonical" way of taking seriously the existing compositionality of deep learning - but being wrong is always an option, there could be some totally different way you could go) - you'll eventually meet the question "what the heck does game theory have to do with anything"
yeah, I keep seeing stuff like open games, what the heck does it have to do with anything?
Learning can be seen as a kind of game: that's a pretty basic observation.
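(If a toy example helps: this is entirely my own sketch, not from any of the papers mentioned here, with a made-up payoff function and made-up names. Two players each follow the gradient of a shared payoff, one descending and one ascending, and "training" computes an equilibrium.)

```python
# Toy sketch of "learning as a game": player 1 minimizes f(x, y),
# player 2 maximizes it, and each follows its own gradient.
import jax
import jax.numpy as jnp

def payoff(x, y):
    # Hypothetical payoff: strongly convex in x and strongly concave in y,
    # so simultaneous gradient descent/ascent converges (a purely bilinear
    # x @ y term alone would make the players orbit the equilibrium instead).
    return jnp.sum(x**2) - jnp.sum(y**2) + x @ y

grad_x = jax.grad(payoff, argnums=0)  # player 1's view of the game
grad_y = jax.grad(payoff, argnums=1)  # player 2's view of the game

@jax.jit
def step(x, y, lr=0.05):
    # One round: each player makes a local best-response move.
    return x - lr * grad_x(x, y), y + lr * grad_y(x, y)

x, y = jnp.array([1.0, -1.0]), jnp.array([0.5, 2.0])
for _ in range(500):
    x, y = step(x, y)
print(x, y)  # both end up near the Nash equilibrium at the origin
```

GANs are the textbook instance of this pattern: generator and discriminator are literally two players trained by coupled gradient steps like these.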
That's kind of what I was thinking – why wouldn't game theory be involved in machine learning?
Ok why would it? (I know the answer btw but I don't think it is a basic observation and I am not sure I can answer succinctly)
As an outsider to ML, I don't have a good answer for exactly how it should connect. But there has to be more than a superficial connection between learning goals and player goals, learning strategies and game theoretic strategies, etc. Any kind of learning system/environment should at least be expressible as a formal game, and from there, game theory ought to at least offer some insights. Same should apply to human learning, as well.
By the way, just because something is a basic observation doesn't necessarily mean it's an easy observation to make, especially for the first time. E.g. it's a basic observation that the number zero is important to mathematics.
Jules Hedges said:
This is probably overkill as a starting point...
Well, I said getting acquainted, not full expert. Bruno mentions optics like 5 minutes into his MSP101 talk, I guess. Also many teleological diagrams are drawn, so it's better to understand what's going on.
Maybe the first of your Max Planck lectures would be a great place to get a quick intro to the diagrams
I think the intuitive answer is that game theory studies distributed decision making and that intelligence rarely (if ever) emerges from individual decision making as opposed to distributed decision making
This is all true, but the connection between the categorical approach to machine learning and game theory is more subtle than that. They're structurally the same. I think the true answer is actually closer to "game theory is a kind of machine learning" than the other way round - the central idea of compositional game theory turned out in retrospect to be a sort of "backpropagation of payoffs". In Strathclyde we're working towards a proper grand unification, but none of it's written down yet...
This looks extremely interesting :D
Matteo Capucci (he/him) said:
Jules Hedges said:
This is probably overkill as a starting point...
Well, I said getting acquainted, not full expert. Bruno mentions optics like 5 minutes into his MSP101 talk, I guess. Also many teleological diagrams are drawn, so it's better to understand what's going on.
Is there a link to a recording of that talk?
John Baez said:
Which talk? Right now we're talking about ancient Greek phonetics for some reason.
Sorry, still getting the hang of zulip
Yeah, I do that too: respond to a comment after the conversation has moved on, without noticing...
I found slides for an MSP101 talk by @Bruno Gavranovic on "Categorical Foundation of Gradient-Based Learning":
http://msp.cis.strath.ac.uk/msp101.html
But no video, because it's a talk from "BC": before coronavirus.
A shame, but thanks!
I think the talk you mention does have a video link on that page, John: https://www.youtube.com/watch?v=ji8MHKlQZ9w
You beat me to it. It was a great talk!
Thanks :D
Oh - it said "slides" right over the talk title but those slides were for the previous talk.
I found this talk by cleverly typing "MSP101 bruno" into Google.
This topic was moved here from #learning: questions > beginner questions by Matteo Capucci (he/him)
This topic was moved by Matteo Capucci (he/him) to #general: off-topic > greek shenanigans and misspelled words
John Baez said:
I found this talk by cleverly typing "MSP101 bruno" into Google.
never underestimate the power of googling
There's also his other talk for the CyberCats seminar https://www.youtube.com/watch?v=tKM8JdXJEII
AFAIR there's a lot of overlap between the two, but just in case
Jules Hedges said:
I think the true answer is actually closer to "game theory is a kind of machine learning" than the other way round
The correspondence I've seen already is that there's a functor from learners into games, which initially seems to suggest the converse to that. Does that suggest there's an interesting map in the other direction?
If there is a nice map in both directions (learners to games and games to learners), do we further have an adjunction between the two?
It really depends on how you define stuff. I'm quite eager to make games and learners the same thing, since their categorical semantics is so similar. One key difference is in the computation of equilibria/the learning process, but they can be put on the same footing as well (by swapping a gadget attached to them)
It'd be nice to link to a paper now, but there's not one yet on this :sweat_smile:
Dylan Braithwaite said:
Jules Hedges said:
I think the true answer is actually closer to "game theory is a kind of machine learning" than the other way round
The correspondence I've seen already is that there's a functor from learners into games, which initially seems to suggest the converse to that. Does that suggest there's an interesting map in the other direction?
If you take the definitions exactly as given in Backprop As Functor and Compositional Game Theory then in a completely literal sense you get a functor from learners to games (I wrote the construction in this short tech report: https://arxiv.org/abs/1902.08666), and I think not the other way. But in private we've been fiddling with the definitions of both so they are both subsumed under a single definition, since we had a gut feeling that it was a good idea. Mostly this involves making learners look more like games. In particular, almost all of the differences between the 2 papers arise from the fact that they're not talking about the differentiable structure that's in machine learning, they stick to Set (which in turn buys them some more generality to talk about other kinds of learners that don't use differentiation). Once you put the differentiable structure back into Backprop As Functor it really starts to look the same as open games - not that there's an adjunction or even an equivalence, but that they're literally the same category
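For reference, here are the shapes being matched, in my own paraphrase of the two papers (treat it as a reading aid, not the official definitions). A learner $A \to B$ in Backprop as Functor is a tuple $(P, I, U, r)$ with
$$I : P \times A \to B \qquad \text{(implement, the forward pass)}$$
$$U : P \times A \times B \to P \qquad \text{(update the parameters)}$$
$$r : P \times A \times B \to A \qquad \text{(request, what gets passed backwards)}$$
while an open game $(X, S) \to (Y, R)$ carries a strategy set $\Sigma$, a play function $\Sigma \times X \to Y$, a coplay function $\Sigma \times X \times R \to S$, and a best-response relation. Matching $P$ with $\Sigma$, $I$ with play, and $r$ with coplay is, if I remember the construction right, essentially the functor in that tech report, with the update $U$ getting encoded into the game's best-response data.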
Exactly - I'm still quite fascinated that it's now plausible to say that exactly the same structure might be shared by learners and open games. Now I'm actually thinking it might even be a good idea to create a stream here about category theory & machine learning... there are definitely papers piling up now that we've finally uploaded ours! :) Categorical Foundations of Gradient-Based Learning
@Bruno Gavranovic do you want to mention that paper over on #practice: our papers ?
Ah, I was just looking if there's a place to mention papers. Yes I will!
Please do create that stream on category theory & machine learning! I would really like to teach a course on that, but I need some good arguments to convince the rest of my department...
I’m reading Baez’s argument on Bayesian statistics. It’s interesting how he connects informational evolution and biological evolution.
I'm reading this article "Mathematics for AI: Categories, Toposes, Types": https://www.cambridge.org/core/books/mathematics-for-future-computing-and-communications/mathematics-for-ai-categories-toposes-types/F75B023A3C42FAB5D4F186376216FC76
Here it talks about the topos of a chain of feedforward neural networks (is it just a chain of neurons, or actually a chain of networks?). [screenshot from the article] What does "freely generated" mean? Does "possible activities" mean the number of neurons that are active? Is ... the same as ...? Does the functor send a neuron to the product of the weights that precede the neuron? What's the difference between ..., ..., and ...? What is the functor trying to say? Since ... is undefined, I understand it as multiplying all the preceding weights like $W_a$; is the map removing the weights between ... and ... to get the weights up to ...? Is there a logical reason for defining such a presheaf?
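To fix notation for my own sake (and possibly wrongly): I believe the base category here is just the chain of layers, so a presheaf is a very concrete gadget. Take the layers as a poset category $\mathcal{C}$ with objects $0 \to 1 \to \cdots \to n$ and a unique arrow $a \to b$ whenever $a \le b$. A presheaf $X : \mathcal{C}^{\mathrm{op}} \to \mathbf{Set}$ then assigns a set $X_a$ of possible activities to each layer $a$ and a map $X_b \to X_a$ to each $a \le b$, and functoriality forces the map for $a \le c$ to be the composite through any intermediate layer $b$. So restricting across several layers at once equals restricting one layer at a time, which seems to be what the products of weights in the screenshot are encoding.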
Here it talks about the subobject classifier of this topos, which I know exists but am not sure why (how does one get the subobject classifier out of a topos? Can anyone provide a reference?). What does "localization" mean? Why is it increasingly determined from the initial layer to the final layer? Why is it the "output information that an external observer can deduce from the activity within the inner layers"? [screenshot from the article]
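Partially answering my own parenthetical, possibly with errors: in any presheaf topos the subobject classifier is given by sieves, and Mac Lane and Moerdijk's "Sheaves in Geometry and Logic" is a standard reference for the construction. Concretely, for $\widehat{\mathcal{C}} = [\mathcal{C}^{\mathrm{op}}, \mathbf{Set}]$ one takes
$$\Omega(c) = \{\,\text{sieves on } c\,\},$$
a sieve on $c$ being a set of morphisms with codomain $c$ closed under precomposition. If $\mathcal{C}$ is the chain of layers $0 \to 1 \to \cdots \to n$, a sieve on layer $a$ amounts to a down-closed subset of $\{0, \dots, a\}$, so $\Omega(a)$ has $a + 2$ elements and grows layer by layer. That would at least be consistent with truth values being "increasingly determined" along the network, modulo however the paper orients its arrows.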
Here it's talking about geometric morphisms. Should I understand ... as the power set here (which can be identified with subobject classifiers)? [screenshot from the article]