Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.


Stream: theory: type theory & logic

Topic: Curry-Howard and Machine Learning


view this post on Zulip Samuel Gélineau (Jan 23 2024 at 13:28):

Somebody posted a correspondence table between Logic, Type Theory, and Machine Learning. It seems wrong to me. What would the correct table look like?

Here's the post that looks incorrect to me, but I don't recommend using it as a starting point; I would prefer to focus on the core concepts: Proposition, Proof, Conjunction, Disjunction, and Implication.
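
For reference, those core concepts line up with types as in this small Haskell sketch (propositions as types, proofs as programs; the function names are just illustrative):

```haskell
-- Propositions as types, proofs as terms (a rough sketch).
-- Conjunction  A ∧ B  ~  pair type (a, b)
-- Disjunction  A ∨ B  ~  sum type Either a b
-- Implication  A ⇒ B  ~  function type a -> b

-- A proof of (A ∧ B) ⇒ (B ∧ A):
andComm :: (a, b) -> (b, a)
andComm (x, y) = (y, x)

-- A proof of (A ∨ B) ⇒ (B ∨ A):
orComm :: Either a b -> Either b a
orComm (Left x)  = Right x
orComm (Right y) = Left y

-- A proof of A ⇒ (B ⇒ A):
constK :: a -> b -> a
constK x _ = x
```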

view this post on Zulip Samuel Gélineau (Jan 23 2024 at 13:28):

https://twitter.com/attunewise/status/1736072333780201526?s=20

view this post on Zulip Kevin Carlson (Jan 23 2024 at 16:57):

Why would there be a correct table? The Curry-Howard correspondence is actual math, not just vague analogies, and I don't see why you'd expect any more than the latter with ML.

view this post on Zulip John Baez (Jan 23 2024 at 17:38):

The ML column in that table involves a lot of linear algebra. There is a pretty solid relationship between logic and programming (at least insofar as they are about cartesian closed categories) and linear algebra (at least insofar as it's about a symmetric monoidal closed category, namely Vect). I wrote about it in my "Rosetta Stone" paper with @Mike Stay. But I'm not claiming this sheds any light on machine learning.
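
Concretely, the shared shape is the closed structure: currying in a cartesian closed category on the logic/programming side, and the tensor-hom adjunction in a symmetric monoidal closed category like Vect on the linear algebra side (a rough sketch):

```latex
% Cartesian closed (logic / typed lambda calculus):
\mathrm{Hom}(A \times B,\ C) \;\cong\; \mathrm{Hom}(A,\ C^{B})
% Symmetric monoidal closed (Vect, with internal hom the space of linear maps):
\mathrm{Hom}(A \otimes B,\ C) \;\cong\; \mathrm{Hom}(A,\ B \multimap C)
```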

view this post on Zulip Samuel Gélineau (Jan 24 2024 at 02:22):

Kevin Arlin said:

Why would there be a correct table? The Curry-Howard correspondence is actual math, not just vague analogies, and I don't see why you'd expect any more than the latter with ML.

I have joined this Zulip via a lecture series called "Categories for AI", and I am only now realizing that this Zulip is about category theory in general, not about the link between category theory and machine learning.

I did not understand everything in that series, but the impression I got was that the essence of neural networks was now understood in the language of category theory. Since I know (from the Rosetta Stone paper quoted above, even!) that the Curry-Howard correspondence extends to category theory, I expected the correspondence to extend to neural networks, via that category-theoretic formalization of neural networks.

I am guessing that, in the same way the Curry-Howard correspondence maps logic to type systems like the STLC's rather than Java's, the extended correspondence would map logic and type systems to constructs in some essence of machine learning, not to constructs in ChatGPT.

view this post on Zulip Mike Stay (Jan 24 2024 at 16:36):

The ML column of that chart looks very ad-hoc to me, more like symbolism than functoriality. What you ought to see in that column is the linear logic version of the first column (specifically, multiplicative intuitionistic linear logic where tensor and par are identified). All the linear algebra used in machine learning is done in a finite-dimensional Hilbert space, and the "Physics" column of our table is really about Hilbert spaces:

[image: the correspondence table from the Rosetta Stone paper, including its Physics column]

Since all morphisms in FDHilb have a conjugate transpose, contraction is fundamentally the same thing as cloning quantum states, which is forbidden. So there isn't a linear algebra notion corresponding to contraction. If you add an exponential comonad, you can include linear functions into all functions and then contract, but that's not what he put in the ML column, either.
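
To spell out why: a contraction map in Vect would have to send every vector v to v ⊗ v, and a one-line check shows no linear map can do that:

```latex
% A cloning/contraction map \Delta(v) = v \otimes v cannot be linear:
\Delta(v + w) = (v + w) \otimes (v + w)
              = v \otimes v + v \otimes w + w \otimes v + w \otimes w
\neq \Delta(v) + \Delta(w) = v \otimes v + w \otimes w
```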

On the other hand, machine learning isn't entirely linear algebra: there are plenty of nonlinear activation functions involved.

[image: examples of nonlinear activation functions]

What's the class of functions you get when you include these? There are theorems like the universal approximation theorem that give some conditions under which a neural network can learn a function. But I haven't seen anyone do the work to define the actual set of functions learnable in a particular framework and build a symmetric monoidal closed category out of it.
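
As a toy illustration of where the nonlinearity enters, here is a Haskell sketch of a one-hidden-layer network of the kind the universal approximation theorem talks about; everything except the pointwise activation is plain linear algebra (the names are just illustrative):

```haskell
-- Toy one-hidden-layer network: linear map, pointwise nonlinearity, linear map.
-- Vectors and matrices are just lists; this is only an illustration.
type Vec = [Double]
type Mat = [[Double]]   -- a matrix as a list of rows

matVec :: Mat -> Vec -> Vec
matVec m v = [sum (zipWith (*) row v) | row <- m]

relu :: Double -> Double
relu x = max 0 x

-- The pointwise relu in the middle is exactly what takes us outside Vect.
network :: Mat -> Mat -> Vec -> Vec
network w1 w2 x = matVec w2 (map relu (matVec w1 x))
```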

view this post on Zulip Mike Stay (Jan 24 2024 at 16:48):

And of course, none of this is to say that people haven't done good work modeling machine learning using category-theoretic concepts, like using lenses & optics to model backpropagation.
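
As a minimal sketch of that idea (ignoring the parameters, so this is not the full parametric-optics story), a lens pairs a forward pass with a backward pass, and lens composition mirrors the reverse-mode chain rule:

```haskell
-- A minimal lens: a forward pass paired with a backward pass that turns
-- an input and a gradient at the output into a gradient at the input.
data Lens a b = Lens
  { fwd :: a -> b
  , bwd :: a -> b -> a
  }

-- Composition re-runs the forward pass to feed the backward pass,
-- mirroring the chain rule of reverse-mode differentiation.
composeLens :: Lens b c -> Lens a b -> Lens a c
composeLens g f = Lens
  { fwd = fwd g . fwd f
  , bwd = \a dc -> bwd f a (bwd g (fwd f a) dc)
  }
```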

view this post on Zulip Chris Barrett (Jan 25 2024 at 06:48):

Relevant paper: https://discovery.ucl.ac.uk/id/eprint/10068974/

view this post on Zulip Jean-Baptiste Vienney (Jan 25 2024 at 19:02):

You can look at cartesian reverse derivative categories, introduced in this paper: Reverse Derivative Categories (Cockett, Cruttwell, Gallagher, Pacaud Lemay, MacAdam, Plotkin, Pronk). It's a categorical axiomatization of the reverse derivative used in machine learning.
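
Roughly, the central piece of structure is a combinator R[-] turning a map f : A → B into its reverse derivative R[f] : A × B → A, with composition following the reverse-mode chain rule (a pointwise sketch of one of the axioms):

```latex
% Reverse derivative: R[f] : A \times B \to A for f : A \to B.
% Composition follows the reverse-mode chain rule (pointwise sketch):
R[g \circ f](x,\ \delta) \;=\; R[f]\bigl(x,\ R[g](f(x),\ \delta)\bigr)
```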

view this post on Zulip Jean-Baptiste Vienney (Jan 25 2024 at 19:05):

There is much more work than this on category theory and machine learning. I've heard that optics are useful. This paper seems to be a good overview (it mentions in particular reverse derivative categories and optics): Category Theory in Machine Learning (Shiebler, Gavranović, Wilson).

view this post on Zulip Jean-Baptiste Vienney (Jan 25 2024 at 19:08):

By the way, I don't completely agree with the second message posted here. Curry-Howard is a real correspondence, but it is rarely stated as "actual math" with a precise isomorphism. The most important thing is to understand the intuition, I think.

view this post on Zulip Jean-Baptiste Vienney (Jan 25 2024 at 19:15):

It is often the case in computer science that things can be encoded in different equivalent ways, like the notion of computable function, which can be defined using Turing machines, recursive functions... Compared to mathematics, we have a bit less of a canonical language.

view this post on Zulip Chris Grossack (they/them) (Jan 25 2024 at 20:13):

One thing that's tangential, but which people in this thread might be interested in: You can embed terms from the lambda calculus (and thus Turing machines) into linear logic, which you can then interpret in the category of vector spaces. This gives a way of doing "gradient descent" directly on Turing machines!

See the talk here: https://www.youtube.com/watch?v=IW4LjjAWrO4
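
The embedding is presumably along the lines of the Girard translation, which guards the intuitionistic implication with the exponential before everything is interpreted linearly (a sketch):

```latex
% Call-by-name Girard translation of implication into linear logic:
(A \to B)^{\circ} \;=\; {!}A^{\circ} \multimap B^{\circ}
```

so a lambda term becomes a linear logic proof, which can then be interpreted in a category built from vector spaces.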

view this post on Zulip Matteo Capucci (he/him) (Jan 26 2024 at 17:06):

:O wow this is such a cool idea!