This is as good a time as any to make a thread here: I've published my PhD thesis!
The thesis title is
Fundamental Components of Deep Learning: A category-theoretic approach
You can find it here, together with the accompanying blog post. At some point I will give more context around this, but for now I'll just leave here the last paragraph of the thesis which summarises how I see this thesis fitting into a broader research programme: one that studies the process by which we find structure in data, and attempts to formalise it with concrete implementations of systems that do so via iterative updates.
Screenshot_20240314_120107.png
I'm looking forward to hearing your thoughts and comments.
And lastly, I want to promote here our latest position paper, Categorical Deep Learning: An Algebraic Theory of Architectures, which directly builds on the construction and provides a general theory of neural network architectures that directly subsumes Geometric Deep Learning: it formulates equivariant maps as homomorphisms of monad algebras, and shows how, by generalising these to homomorphisms of algebras for particular endofunctors, we can go beyond equivariance with respect to invertible transformations and start capturing transformations which aren't invertible, such as constructors for lists, trees, and all sorts of inductive and coinductive structures from theoretical computer science.
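Roughly, in the simplest case (my informal paraphrase, not a quote from the paper): for the monad $T = G \times ({-})$ of a group $G$, a $T$-algebra is a set equipped with a $G$-action, and an algebra homomorphism $f \colon A \to B$ is precisely a map satisfying
$$ f(g \cdot a) = g \cdot f(a), $$
i.e. an equivariant map. Swapping the monad for an endofunctor such as $F(X) = 1 + A \times X$, an algebra is a choice of "nil" and "cons" maps on $X$, and an algebra homomorphism is exactly a fold over lists: it still respects the structure, even though the structure maps are no longer invertible.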
This generalisation revealed some interesting correspondences, for instance between parametric Mealy machines and recurrent neural networks. We're still trying to figure out many parts of this, and I imagine lots of people in the community here might have interesting ideas about structured computation that can be captured in these ways.
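To make the Mealy machine correspondence concrete, here's a minimal sketch (my own illustration, assuming NumPy; this is not code from the paper): a Mealy machine is a map state × input → state × output, and an RNN cell with weights is exactly such a map, with the weights living in the parametric direction.

```python
import numpy as np

def rnn_cell(params, state, x):
    """One step of a (parametric) Mealy machine: (state, input) -> (state, output)."""
    W_h, W_x, W_y = params
    new_state = np.tanh(W_h @ state + W_x @ x)   # state transition
    output = W_y @ new_state                     # readout
    return new_state, output

def run(params, state, xs):
    """Iterating the Mealy machine over a sequence is exactly running the RNN."""
    outputs = []
    for x in xs:
        state, y = rnn_cell(params, state, x)
        outputs.append(y)
    return state, outputs

# tiny usage: 3-dim state, 2-dim inputs, 1-dim outputs
rng = np.random.default_rng(0)
params = (rng.normal(size=(3, 3)), rng.normal(size=(3, 2)), rng.normal(size=(1, 3)))
_, ys = run(params, np.zeros(3), [rng.normal(size=2) for _ in range(5)])
```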
Screenshot_20240314_121419.png
Bruno Gavranović said:
I'll just leave here the last paragraph of the thesis which summarises how I see this thesis fitting into a broader research programme [...]
I'm looking forward to hearing your thoughts and comments.
That resonates with a vague thought I had while watching this video yesterday. In it, they build an AI agent to play Trackmania (a car racing game); the agent is faster than humans, but very inconsistent due to some chaotic behaviour in the physics engine. Humans, on the other hand, can get fairly consistent (modulo human parameters like fatigue), but not at such high speeds. I was thinking about how to train for consistency in such a setting, and I believe using a compositional model might help: if the agent understands which situations are harder to navigate, as humans do, it will be more careful and try to mitigate the inconsistencies. Are there any results about compositionality as a means to achieve consistency?
Ah, absolutely!
I think compositional models are the key to achieving consistency. Take, for example, the universal approximation theorem: it tells you that with a wide enough network you can approximate any (continuous) function arbitrarily well. And while that's true, there's a very neat visual proof showing it tells you nothing about generalisation: the approximation works by fitting a bunch of squares under the function, which means that for any finite dataset there's always a part of the input space the network hasn't seen, and on which it has no means of generalising well.
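As a toy illustration of that last point (my own sketch, assuming NumPy, standing in for the "fitting squares" picture): a piecewise-constant fit built from finitely many samples does fine inside the sampled region and says nothing sensible outside it.

```python
import numpy as np

f = np.exp                               # the "true" function we only see samples of
xs = np.linspace(0.0, 3.0, 30)           # a finite dataset on [0, 3]
ys = f(xs)

def approx(x):
    """Piecewise-constant fit: the value of the nearest sampled point."""
    return ys[np.argmin(np.abs(xs - x))]

print(abs(approx(1.5) - f(1.5)))    # small: x lies inside the sampled region
print(abs(approx(10.0) - f(10.0)))  # huge: nothing constrains the fit out here
```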
So there is no 'consistency', as you call it: you can't learn a nonlinear function with a single linear layer, because all it can produce is a linear combination of its inputs, which a nonlinear function is not.
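For instance (a minimal sketch of my own, assuming NumPy; XOR is just the standard example, not one from this thread), the best affine fit to XOR is off by 0.5 on every single point:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])             # XOR

A = np.hstack([X, np.ones((4, 1))])            # add a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)      # best least-squares affine fit
print(A @ w)                                   # [0.5 0.5 0.5 0.5] -- stuck halfway
```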
I actually watched the other video from the same creator on YouTube, and I believe the same story applies here. The current Trackmania model seems to largely overfit, and can easily get confused by out-of-distribution examples.
For this particular case, some physics-based priors seem necessary to achieve good generalisation, but I'm not sure what algebraic structure could be used to encode them.
I think the same issue arises when dealing with, say, structural recursion. There is ample evidence (1, 2) that transformers perform an operation analogous to 'fitting squares' when learning complex algorithms, and have no architectural bias that would allow them to learn how to perform, say, structural recursion. They can learn it for 1, 3, 10, or perhaps 20 steps, but they eventually break down.
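To illustrate the distinction (my own sketch, not taken from the cited papers): genuine structural recursion handles inputs of any size, whereas a fixed-depth "unrolled" computation, which is the kind of thing a fixed-size model can imitate, silently breaks past the depth it was fit to.

```python
def length_rec(xs):
    """Structural recursion on lists: correct for inputs of any size."""
    return 0 if not xs else 1 + length_rec(xs[1:])

def length_unrolled(xs, depth=20):
    """A fixed-depth unrolling: only correct for lists with at most `depth` elements."""
    n = 0
    for _ in range(depth):
        if not xs:
            return n
        xs, n = xs[1:], n + 1
    return n  # silently wrong for longer inputs

print(length_rec(list(range(50))), length_unrolled(list(range(50))))  # 50 vs 20
```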
The deep learning community has indeed identified compositionality as a useful tool for describing the world around us --- take Yoshua Bengio's Turing Lecture, which states that 'Compositionality is useful to describe the world around us effectively'. But no specific connections to category theory have been made.
Perhaps the consistency problems can be dealt with by explicitly training for consistency: selecting for strategies that take longer to diverge when applied to very small perturbations of the input?
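One way to make that concrete (a rough sketch of my own, not an established recipe from this thread; `policy`, `task_loss` and `lam` are placeholder names): add a penalty measuring how much the outputs move under small input perturbations.

```python
import numpy as np

def consistency_penalty(policy, x, eps=1e-3, n_samples=8):
    """Average change in the policy's output under small random input perturbations."""
    base = policy(x)
    diffs = [
        np.linalg.norm(policy(x + eps * np.random.randn(*np.shape(x))) - base)
        for _ in range(n_samples)
    ]
    return float(np.mean(diffs))

# total_loss = task_loss + lam * consistency_penalty(policy, x)
```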
Bruno, can I ask about your involvement with Paul Lessard and Symbolica? He was just on Machine Learning Street Talk and they claimed to have raised $30M; the main paper they referenced was CDL: An Algebraic Theory of Architectures.
What is this $30m specifically for? Are you accruing compute to now train a model based on Para?
Congrats on publishing the thesis btw!
https://finance.yahoo.com/news/vinod-khosla-betting-former-tesla-130000001.html
Noah Chrein said:
What is this $30m specifically for?
And even more importantly, how much of this $30M are you giving to your fellow category theorists to fund their research? :hearts:
@Ryan Wisnesky please don't post links without accompanying text...
I clicked anyway. It made me sad.
Khosla admits he does not understand the math-filled paper—pointing out there are very few people in the world who fully understand category theory—“when these really smart people gravitate to an idea, it’s an important idea,” he said.
:man_facepalming: I really do not want CT to be beached by a tech hype tide... but at least if it's successful enough policymakers will put money into it for a while.
Noah Chrein said:
Bruno, can I ask about your involvement with Paul Lessard and Symbolica? He was just on Machine Learning Street Talk and they claimed to have raised $30M; the main paper they referenced was CDL: An Algebraic Theory of Architectures.
What is this $30m specifically for? Are you accruing compute to now train a model based on Para?
Absolutely, @Noah Chrein .
This is as good a time as any to announce a few different pieces of exciting news:
I'm incredibly excited about this, for a few different reasons.
Monoids, categories, universal properties and other concepts from category theory have been an indispensable tool for me and for many other scientists trying to understand the world. They have allowed us to find robust patterns in data, and also to communicate, verify and explain our reasoning to one another. In many ways, isn't this the goal of deep learning? The creation of models which understand the world in robust, generalisable, but also verifiable ways?
Noah Chrein said:
What is this $30m specifically for?
The funding is to be used for the development of this research programme, staffing, and compute.
fosco said:
how much of this $30M are you giving to your fellow category theorists to fund their research? :hearts:
We're hiring fellow category theorists! We're opening up offices in UK and AUS - check out the job ads.
Morgan Rogers (he/him) said:
Khosla admits he does not understand the math-filled paper—pointing out there are very few people in the world who fully understand category theory—“when these really smart people gravitate to an idea, it’s an important idea,” he said.
:man_facepalming: I really do not want CT to be beached by a tech hype tide... but at least if it's successful enough policymakers will put money into it for a while.
This is a justified concern, and one I've been thinking about myself. It's easy to overpromise and build hype! I'm trying hard - together with some other fantastic people we've hired - to keep us grounded.
Khosla indeed isn't privy to the insights behind CT, but our team is.
You should advertise those over in #community: positions !
I did it - easy enough to do while I was looking over the ads.
Thanks @John Baez !
Thanks for the info Bruno, this is wonderful.
Morgan Rogers (he/him) said:
I really do not want CT to be beached by a tech hype tide
This is just one of the first waves of a storm brewing in structured AI. The structure-first (i.e. Cat Theory) view of AI is probably correct, and I think more VC-type "whales" are about to beach themselves here. Let's hope we don't get too disrupted by whatever explodes out of the beached whales. This is what I was trying to allude to in this thread when I mentioned potential AI agents entering the Zulip (originally in response to a different thread, but moved by a mod). $30M is a "drop in the ocean" for AI VC funding, and I can imagine some poaching of category theorists.
I wonder @Bruno Gavranović if you have any thoughts about how this community and individual category theorists can navigate the structured-AI hype train that, I feel, is about to hit CT. Maybe we can continue this conversation in that other thread or start a new one.
What does "VC" stand for?
https://en.wikipedia.org/wiki/Venture_capital
Thanks @Ralph Sarkis