Hi all, we just put our paper Categorical Foundations of Gradient-Based Learning up on arXiv. This is joint work with @Geoff Cruttwell, Neil Ghani, @Paul and @Fabio Zanasi.
We provide a 2-categorical foundation for many types of neural networks in terms of three things: 1) parameterized maps (the Para construction), 2) lenses, and 3) reverse derivative categories.
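To unpack the lens part a little: a lens here is just a forward map together with a backward map, and lens composition runs the backward maps in reverse order, exactly like backprop. Here's a minimal Python sketch (my own illustrative names, not code from the paper):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Lens:
    get: Callable  # forward pass:  A -> B
    put: Callable  # backward pass: (A, dB) -> dA

def compose(f: Lens, g: Lens) -> Lens:
    """Forward passes compose left to right; backward passes
    compose right to left, as in reverse-mode AD."""
    return Lens(
        get=lambda a: g.get(f.get(a)),
        put=lambda a, db: f.put(a, g.put(f.get(a), db)),
    )

# A smooth map paired with its reverse derivative is such a lens:
square = Lens(get=lambda x: x * x,
              put=lambda x, dy: 2 * x * dy)  # R[x^2](x, dy) = 2x*dy
```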
This includes learning on the familiar Euclidean spaces, but also weirder settings like learning on Boolean circuits (since, surprisingly, these form a reverse derivative category too).
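To give a flavour of the Boolean case, here's a small sketch of a reverse derivative for Boolean functions over Z_2 = {0, 1}, using the XOR finite-difference derivative; this is an illustration of the idea rather than the paper's exact construction:

```python
from typing import Callable, List

Bits = List[int]  # vectors over Z_2

def reverse_derivative(f: Callable[[Bits], Bits], n: int):
    """R[f]: (x, dy) -> dx via the transposed Jacobian over Z_2,
    with entries J[j][i] = f(x ^ e_i)[j] XOR f(x)[j]."""
    def rf(x: Bits, dy: Bits) -> Bits:
        fx = f(x)
        dx = []
        for i in range(n):
            flipped = x.copy()
            flipped[i] ^= 1  # flip the i-th input bit: x ^ e_i
            col = f(flipped)
            # dx_i = sum over j of J[j][i] * dy_j, computed mod 2
            dx.append(sum((col[j] ^ fx[j]) & dy[j]
                          for j in range(len(dy))) % 2)
        return dx
    return rf

# Example: an XOR gate. Both inputs "receive" the output change.
xor_gate = lambda x: [x[0] ^ x[1]]
rd = reverse_derivative(xor_gate, n=2)
print(rd([1, 0], [1]))  # [1, 1]
```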
It also turns out that a whole family of things called "optimizers" are lenses as well - starting from standard gradient descent, through Momentum and Nesterov Momentum, to more complex optimizers like Adagrad and Adam.
This was also a bit surprising, but it somehow all fits together: since optimizers are lenses, they end up being 2-cells in our 2-category. But I'll stop here and refer you to the details in the paper.
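To illustrate the optimizer point concretely, here's a rough sketch reusing the `Lens` class from the snippet above (again, illustrative Python with made-up names, not the paper's formalism): gradient descent is a lens whose forward map is the identity on the parameters, and momentum just carries extra state:

```python
def gradient_descent(lr: float) -> Lens:
    # State = the parameters themselves; `put` applies the update rule.
    return Lens(
        get=lambda p: p,
        put=lambda p, grad: p - lr * grad,
    )

def momentum(lr: float, beta: float) -> Lens:
    # State = (parameters, velocity); `get` projects out the parameters.
    def put(state, grad):
        p, v = state
        v_new = beta * v + grad
        return (p - lr * v_new, v_new)
    return Lens(get=lambda state: state[0], put=put)
```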
Looking for any feedback or thoughts you may have about it :)
This is great, @Bruno Gavranovic. I will be reading this in the next couple of weeks. Will let you know if I have any comments
Well I just discovered that this stream, which I previously wasn't subscribed to, has a lot of stuff I'm interested in
I'll also supplement the paper with a video of the presentation I gave on it. The video is much more informal and visual, and it takes a slightly different perspective on the material.
Loving it! You mention that reverse differential categories are good for classification b/c the target dimension is low. Does that mean we should expect forward differentials to be better for generative tasks?
That's a great question and something I've been wondering about as well. I suspect so, but honestly, I don't know - I'd love to hear from an AI expert on this