Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: learning: questions

Topic: are proof assistants useful for CT?

Joshua Meyers (Mar 30 2021 at 15:44):

I'm getting tired of writing long boring proofs of things that are obvious (e.g. show that an object satisfies a universal property, show that two bicategories are equivalent, show that a functor is monoidal). Will a proof assistant help my situation, or will it take even longer? If it will help, which one do you recommend?

Nathanael Arkor (Mar 30 2021 at 15:47):

I don't think proof assistants help much with these tedious details yet, but it's being worked on. See this thread, for instance.

John Baez (Mar 30 2021 at 16:10):

Proof assistants are very useful for these jobs, but they're called "grad students".

Joshua Meyers (Mar 30 2021 at 16:17):

This is the kind of thing I'm looking for: https://www.cs.cornell.edu/~kozen/Papers/06ijcar-categories.pdf

Anyone know what's been going on with this sort of approach?

Joshua Meyers (Mar 30 2021 at 16:23):

OK something more recent: https://link.springer.com/chapter/10.1007/978-3-642-29485-3_9

Joshua Meyers (Mar 30 2021 at 16:46):

OK I'm going to try Nuprl unless someone has a good reason not to (e.g. it's not well-maintained). These papers make it seem like it could do the thing I want

Joshua Meyers (Mar 30 2021 at 16:58):

Or there's also MetaPRL and RedPRL which are apparently successors of Nuprl...which one do I choose?

Joshua Meyers (Mar 30 2021 at 17:12):

OK just emailed Christoph Kreitz:

Subject: Proof assistant for tedious CT computations

Hi Professor Kreitz,

I am a math PhD student at Indiana University, and I am having to do a lot of tedious category theory computations in my work. I was looking for a proof assistant that would save me some time, and I found your paper "Automating Proofs in Category Theory" and your chapter "Nuprl as Logical Framework for Automating Proofs in Category Theory", which seem to be just the kind of thing that I am looking for. So I am wondering, is it a good idea to use Nuprl for this kind of thing, or is there another proof assistant out now that would be better for this purpose (MetaPRL, RedPRL, Coq?) Thanks for any advice you can give,

Best,
Joshua

Javier Prieto (Mar 30 2021 at 19:05):

Are you ignoring more popular proof assistants because you don't think they'll be particularly useful for category theory? I know Lean has some CT in its standard library (and I'd be surprised if others didn't as well) but maybe that's not enough for your purposes

Henry Story (Mar 30 2021 at 19:24):

Proof assistants can be very useful as a library for finding out what terms mean. For example Agda Categories can be thought of as a huge executable library, that explains what terms mean by showing how they are implemented. It would be great to have a Cubical HoTT version. I think as they make progress they will be very useful to look up definitions. And also one learns by doing.

Joshua Meyers (Mar 30 2021 at 19:44):

Javier Prieto said:

Are you ignoring more popular proof assistants because you don't think they'll be particularly useful for category theory? I know Lean has some CT in its standard library (and I'd be surprised if others didn't as well) but maybe that's not enough for your purposes

No, the only reason I'm thinking about Nuprl is that I saw those papers about doing exactly what I want to do with Nuprl. Do you think that it is possible to do the same sort of thing in more popular proof assistants? I really don't know much about proof assistants, so I don't know how similar or different they are.

Fabrizio Genovese (Mar 30 2021 at 22:06):

Usually, formalizing a proof in a proof assistant is considerably harder than working on paper.

Fabrizio Genovese (Mar 30 2021 at 22:07):

What you gain is that if it typechecks you can be certain of your proof. But no, it will not make it less tedious. On the contrary, it will make it way more tedious, and you will have to spend a great deal of time to convince your compiler that some thing that you consider totally obvious is true

Fabrizio Genovese (Mar 30 2021 at 22:08):

Also, while working with proof assistants the way you formalize things in the beginning makes a great deal of difference. For instance, at Statebox we decided to formalize categories as strict (associativity and identities are equalities, not isos) and this has been a dramatically bad choice. It turns out that, at least while using dependent type theory, non-strict definitions are much more manageble.

Fabrizio Genovese (Mar 30 2021 at 22:10):

This will, presumably, force you to go back and forth many times to completely rewrite your definitions in a way that makes sense. Obviously you will find out that what you thought would make sense does not only after hours/weeks/months that you have worked on it.

Fabrizio Genovese (Mar 30 2021 at 22:11):

So, long story short: Formalizing proofs with computers is unbelievably cool and interesting for a lot of reasons, and you should definitely give it a try. But if your goal is to "get rid of the verboseness and save time" this is really, really, really the wrong choice xD

James Wood (Mar 30 2021 at 22:17):

There are in principle a lot of powerful tricks for category theory that would be facilitated by verified automation. But at least at the moment, as far as I've seen, these tend to be a little heavy to deploy, and may require learning a bit of programming language theory.

Joshua Meyers (Mar 30 2021 at 22:40):

I should be clear on what exactly I want to avoid. I want to avoid having to spend lots of time on stuff that is easy (and thus boring). If there is a way to spend about the same amount of time to get to the same place, or even a bit more (due to a learning curve), but have it be challenging, I would greatly prefer that.

Joshua Meyers (Mar 30 2021 at 22:45):

For example, if instead of cranking out proofs where I right down the obvious thing and it works, I'd rather go up one metalevel and think about strategies for coming up with "the obvious thing"

Joshua Meyers (Mar 30 2021 at 22:46):

I also do feel like in the long run this would save me time, but maybe that is just me being naive and never having used proof assistants before

Joshua Meyers (Mar 30 2021 at 22:50):

Also, I really like Lambek's approach of thinking of categorical structures syntactically (a language of objects, a language of arrows, a language of equalities between arrows), so I think what a proof assistant would expect would align with the way I already think about CT anyway

John Baez (Mar 30 2021 at 23:01):

Joshua Meyers said:

I should be clear on what exactly I want to avoid. I want to avoid having to spend lots of time on stuff that is easy (and thus boring).

That's really easy: just don't do it!

If I'm learning about something, and I don't feel like verifying some calculation, I skip it. Sometimes easy calculations are fun and/or enlightening - then I do them. Sometimes I need to do a bunch of straightforward calculations to "get something into my system" so I can prove new stuff. And when I'm writing a paper I feel a duty to check everything carefully, since it'll be coming out with my name on it. But I don't feel any duty to verify every standard fact that I'm learning about.

James Wood (Mar 30 2021 at 23:14):

Joshua Meyers said:

For example, if instead of cranking out proofs where I right down the obvious thing and it works, I'd rather go up one metalevel and think about strategies for coming up with "the obvious thing"

This sort of thing you can definitely do, and if you're wanting to learn something, then algebraic solvers are full of nice theory, and worth doing as an exercise. However, my experience is that (1) there are always more theories you want to solve than you expect, (2) solvers almost completely fail to compose, and (3) once you've done a few, developing new solvers becomes the boring work, because the process isn't as streamlined as (I think) it could be.

James Wood (Mar 30 2021 at 23:15):

This is basically what I mean by “heavy”.

Joshua Meyers (Mar 31 2021 at 00:01):

John Baez said:

Joshua Meyers said:

I should be clear on what exactly I want to avoid. I want to avoid having to spend lots of time on stuff that is easy (and thus boring).

That's really easy: just don't do it!

If I'm learning about something, and I don't feel like verifying some calculation, I skip it. Sometimes easy calculations are fun and/or enlightening - then I do them. Sometimes I need to do a bunch of straightforward calculations to "get something into my system" so I can prove new stuff. And when I'm writing a paper I feel a duty to check everything carefully, since it'll be coming out with my name on it. But I don't feel any duty to verify every standard fact that I'm learning about.

This works when I'm reading something passively, but when I want to really engage with the material --- ask whether this similar thing is true, see how I can generalize it, ask a question and try to answer it, etc. I can't just rely on my trust in the author anymore. I'd like to have a way to "explore" a topic myself that doesn't rely on tedious calculations every step of the way.

Joshua Meyers (Mar 31 2021 at 00:02):

James Wood said:

Joshua Meyers said:

For example, if instead of cranking out proofs where I right down the obvious thing and it works, I'd rather go up one metalevel and think about strategies for coming up with "the obvious thing"

This sort of thing you can definitely do, and if you're wanting to learn something, then algebraic solvers are full of nice theory, and worth doing as an exercise. However, my experience is that (1) there are always more theories you want to solve than you expect, (2) solvers almost completely fail to compose, and (3) once you've done a few, developing new solvers becomes the boring work, because the process isn't as streamlined as (I think) it could be.

Thanks!

What do you mean by "composing" solvers?
What about the process of developing solvers is not as streamlined as you think it could be?

John Baez (Mar 31 2021 at 02:03):

This works when I'm reading something passively, but when I want to really engage with the material --- ask whether this similar thing is true, see how I can generalize it, ask a question and try to answer it, etc. I can't just rely on my trust in the author anymore. I'd like to have a way to "explore" a topic myself that doesn't rely on tedious calculations every step of the way.

Yes, that's important. There are lots of ways to do that. With category theory I often explore topics by thinking about them in words using general principles. When two principles seem to conflict, or the principles I know don't seem powerful enough to guess any answer to some question, then sometimes I calculate. At that point it's often not tedious, because I'm in actual suspense about how the answer will come out! (Sometimes it still is tedious, but I try to do the bare minimum of calculation necessary to point my intuition in the right direction.)

When I say "think in words", I mean that I have 3 main ways of thinking: using sentences that I say to myself in my mind, using mental pictures, and using mathematical symbols.

Joshua Meyers (Mar 31 2021 at 02:51):

Yeah this could work, and I'll try it, but it would still be nice to be able to check whether things are true

Joshua Meyers (Mar 31 2021 at 03:10):

Anyway, could someone please tell me whether the stuff that Prof. Kreitz et al are doing in those papers is specific to Nuprl or whether it could be done in any proof assistant, and if the latter, which proof assistant they would recommend for this purpose?

Javier Prieto (Mar 31 2021 at 08:45):

I suspect the proof in the first paper you linked could be done in Lean, since the essential ingredients seem to already be part of the standard library: https://leanprover-community.github.io/mathlib_docs/category_theory/closed/cartesian.html#category_theory.curry_natural_right

You can click around to see if the definitions and lemmas you need are in there. Also note that I'm extremely biased because I only have (some very limited) experience with Lean, so I knew where to look. It's perfectly possible other proof assistants fare better in CT. If you want a somewhat idiosyncratic list of proof assistants ranked by "how much math they have formalized" I recommend https://www.cs.ru.nl/~freek/100

Matteo Capucci (he/him) (Apr 01 2021 at 08:55):

A trick to piggyback on existing proof assistants (like Agda) is to use the internal language of your categories to work in their type theory. Being careful to not use things your language hasn't (e.g. products/pairs, exponentials/arrow types), you can carry on quite a bit of proofs.
I noticed it often happens that the things you actually think as trivial become really trivial (meaning the proof assistant can figure it out by itself) when done like this.
I'm not sure how far you can go with this approach, though. I myself done very little stuff like this, but not because I encountered obstacles.

Nathanael Arkor (Apr 01 2021 at 13:43):

Being careful to not use things your language hasn't (e.g. products/pairs, exponentials/arrow types), you can carry on quite a bit of proofs.

It seems risky to me to do this for a significantly weaker theory than the "internal language of Agda", because the proof assistant may make some assumptions that are only valid in the stronger metatheory. For example, when you have exponentials, you have distributivity of products over sums, but this isn't true in general. However, it might not be obvious that you're using that fact automatically in a proof assistant, because the role exponentials play isn't completely explicit, if all you're seeing are products and sums.

Matteo Capucci (he/him) (Apr 01 2021 at 14:13):

Maybe @Christian Williams native type theory could help here?

Christian Williams (Apr 01 2021 at 15:15):

Well I think NTT produces type systems that are basically "Coq(T)", the type system of Coq/Agda parameterized by some simple type theory T. But we haven't verified this yet. Either way, not sure how it might help here.

Reid Barton (Apr 02 2021 at 15:25):

It's also very hard to avoid using universes, unless you never do any kind of abstraction or quantification over types, which is quite awkward--they are even likely to appear in the statement of the theorem you want to verify.

Reid Barton (Apr 02 2021 at 15:34):

For example: a while ago I asked here whether the free abelian group monad preserves monomorphisms in any topos. If I proved it without axioms in Agda/Coq/Lean, I would technically only know that it holds in topoi that contain a countable tower of universes.

Reid Barton (Apr 02 2021 at 15:36):

And that is technically extra power, for example, I could also prove Con(ZFC) which is a statement about a polynomial not having any solutions.

Reid Barton (Apr 02 2021 at 15:36):

(At least I could if I add choice as an axiom, I'm not sure about otherwise.)

Reid Barton (Apr 02 2021 at 15:46):

Mike Shulman (Apr 02 2021 at 23:56):

I think NTT or something like it can help. The point is to take the category you're interested in and embed it by Yoneda in a category that models all the type theory of your proof assistant, including exponentials, distributivity, universes, etc. Then whichever of those structures are preserved by the Yoneda embedding, like exponentials and products, can be used directly as operations on the "representables" inside your type theory, whereas the structures that are present in your category but aren't preserved by Yoneda, like sums, are asserted as separate structure on the representables that doesn't coincide with the structure of the ambient proof assistant type theory.

Mike Shulman (Apr 02 2021 at 23:58):

And if you have some structure that isn't preserved by Yoneda but does "act sufficiently like" the corresponding structure in a topos, then you can make it coincide with that of the embedding type theory by using a sheafified Yoneda embedding into a sheaf topos. Where of course "act sufficiently like" means essentially "defines a subcanonical Grothendieck topology", i.e. "is a postulated colimit", e.g. sums that are extensive.

Cody Roux (Apr 03 2021 at 13:48):

As a side note, I think it's worth entertaining the (very) related question, "are automatic theorem provers useful for CT"?

Given the effectiveness of tools like GAP and the power of provers like Vampire (and the reasonable effectivness of things like Djinn), I wouldn't be surprised if one could build a somewhat effective automatic tool to discharge some of these "trivial" obligations.

Though one should note that these obligations are far from trivial from an ATP perspective.

I briefly tried to build the skeleton of such a system, but got distracted. I think my friend @Philip Zucker has made a more successful attempt.

If I had but a grad student...

Philip Zucker (Apr 03 2021 at 14:19):

Thanks for the ping! Yes, I've been sort of slowly plunking away at the problem of automation for category reasoning in various forms for about a year now. I don't know that anything I've done is genuinely useful as of yet.

https://www.philipzucker.com/category-theory-in-the-e-automated-theorem-prover/ Here I tried using Automated first order logic theorem provers off the shelf to try an encode some universal properties. I'm not particularly confident my encoding is sound.

I switched tracks to instead of treating universal properties, which feel more complicated, to just equational reasoning

https://www.philipzucker.com/notes-on-synthesis-and-equation-proving-for-catlab-jl/ Using ATP like Vampire and E again. This post is mostly just junk now.
https://www.philipzucker.com/theorem-proving-for-catlab-2-lets-try-z3-this-time-nope/ Using Z3 SMT for encoding. Seemed correct, but the encoding was too heaviweight to do anything more than trivial. Perhaps with recent lessons I could go back and tune it up.

The egg paper came out https://egraphs-good.github.io/ , which describes a data structure for equational reasoning. I implemented a version here https://www.philipzucker.com/egraph-1/ and Alessandro Cheli made a better version of that here https://github.com/0x0f0f0f/Metatheory.jl . I think I have a sound system for equational reasoning in a cartesian category here, once in Julia using Metatheory.jl, once in Rust using egg.

Currently I'm trying to fix up the data structure so that it records and prints its proofs in a human readable way and auto translating the axioms sets that are contained in the Julia package Catlab https://github.com/AlgebraicJulia/Catlab.jl into my system. You may also find Catlab interesting in it's own right depending on what you want.

Having said all that, I suspect you'd find it most interesting to try and use Agda categories as mentioned above. https://github.com/agda/agda-categories No automated system as of yet gets you all that much as far as I am aware.

Jacques Carette (Apr 03 2021 at 15:07):

Note that I recently pulled in the PR from @Reed Mullanix that gives you a tactic for equational reasoning for morphism equations in categories. I'm sure that's just the start / tip of the iceberg.

Philip Zucker (Apr 03 2021 at 16:02):

Cool! I saw him tweet about this. I vaguely thought it was a port of the monoid solver so it mainly dealt with associativity and identity absorption. Any idea how likely the approach is to extend to complex queries?

Reed Mullanix (Apr 03 2021 at 20:49):

The basic idea of that solver is that we want to embed our category into a nice place where equalities hold up to beta-eta equivalence, then let agda figure out the details from there. For arbitrary categories this place is the category of copresheaves, and the embedding is yoneda. This is nice insofar that the tactic has actual categorical semantics, but is a bit underpowered when it comes to adding extra stuff ontop of it.

I think we can do something similar with Closed Monoidal Categories using a "curried" form of day convolution, but I've yet to play with it too much

Cody Roux (Apr 03 2021 at 21:06):

Nice! So some kind of NbE?

Cody Roux (Apr 03 2021 at 21:06):

I've often wondered what one could generalize this to, in terms of which equational theories.

Reed Mullanix (Apr 03 2021 at 21:27):

Yeah, it is a bit mysterious. For instance, I _dont_ think it would work with any old Monoidal Category, as the day convolutions doesn't seem to melt away in the same way that the curried version would. This is all speculation though!

Reed Mullanix (Apr 03 2021 at 21:29):

To elaborate, you would end up with this giant pile of tensors that you would have to sift through, whereas you can shift some stuff around when the category is closed monoidal to get rid of all of the tensors that might show up

James Wood (Apr 03 2021 at 22:03):

Cody Roux said:

I've often wondered what one could generalize this to, in terms of which equational theories.

Yeah, there's interesting variety in NbE-style solvers. Sometimes, you have to make extra use of the fact that expressions are given syntactically, so only finitely many elements have to be considered. The simplest example would be commutative monoids, where expressions in n variables are interpreted into Vec ℕ n (for each variable, how many times it occurs in the expression). That is, you don't use a general free commutative monoid, but rather a specific, data-like, free commutative monoid on finite sets.

Cody Roux (Apr 04 2021 at 02:32):

Right, but that's asking for a lot: I don't expect a general purpose solution to be clever about commutativity unless I treat that specially.

Cody Roux (Apr 04 2021 at 02:32):

I'd expect things to be similar to how we handle completion in TRS, since that's what I know about.

Morgan Rogers (he/him) (Apr 04 2021 at 09:25):

Reed Mullanix said:

For arbitrary categories this place is the category of copresheaves, and the embedding is yoneda.

Huh? Why copresheaves?

Reed Mullanix (Apr 05 2021 at 01:08):

Presheaves could work too, copresheaves just happen to work out a bit nicer for programming reasons.

Morgan Rogers (he/him) (Apr 05 2021 at 09:27):

But then Yoneda is contravariant... I'm curious about the programming reasons. Is it just that defining C^op in order to construct presheaves is more effort than just working with functors on C because you already have C?

Reed Mullanix (Apr 05 2021 at 16:43):

So the tactic works by first reifying some series of morphisms composed together into a tree, like so:

  data Expr : Obj → Obj → Set (o ⊔ ℓ) where
    _∘′_ : ∀ {A B C} → Expr B C → Expr A B → Expr A C
    id′  : ∀ {A} → Expr A A
    [_↑] : ∀ {A B} → A ⇒ B → Expr A B

As a note on notation, A ⇒ B is notation for a morphism between A and B

We then turn this reified tree of morphisms into an action on homs:

  embed : Expr B C → A ⇒ B → A ⇒ C
  embed (f ∘′ g) h  = embed f (embed g h)
  embed id′ h       = h
  embed [ f ↑] h    = f ∘ h

This is the "programming reason". We could have it be:

  embed-contra : Expr C A → A ⇒ B → C ⇒ B
  embed-contra (f ∘′ g) h = embed-contra g (embed-contra f h)
  embed-contra id′ h = h
  embed-contra [ x ↑] h = h ∘ x

But the traversal is a lot less natural

Full code is here if you want to take a peek :)
https://github.com/agda/agda-categories/blob/master/src/Categories/Tactic/Category.agda

Cody Roux (Apr 05 2021 at 22:11):

Neat! I learned this trick from Peter Dybjer's slides

Reid Barton (Apr 05 2021 at 22:16):

We use a related trick in the Lean category theory library: if we have a lemma of the form $f \circ g = h$ that we want the simplifier to apply, even if it appears in some tree of compositions in which $f$ and $g$ are not necessarily siblings, we automatically generate a second lemma of the form $f \circ (g \circ k) = h \circ k$ , which will always apply after moving all parentheses to the right

Reid Barton (Apr 05 2021 at 22:20):

It's also turning a system of compositions-in-a-category into a "native" composition $(f \circ -) (g \circ -) k$ .

Jon Sterling (Apr 07 2021 at 21:04):

Mike Shulman said:

I think NTT or something like it can help. The point is to take the category you're interested in and embed it by Yoneda in a category that models all the type theory of your proof assistant, including exponentials, distributivity, universes, etc. Then whichever of those structures are preserved by the Yoneda embedding, like exponentials and products, can be used directly as operations on the "representables" inside your type theory, whereas the structures that are present in your category but aren't preserved by Yoneda, like sums, are asserted as separate structure on the representables that doesn't coincide with the structure of the ambient proof assistant type theory.

Indeed! And for others in the room, this is in essence also the method that we used to prove metatheoretic results such as normalization for cubical type theory, except that we pass through not only the Yoneda embedding but a further open embedding until we reach a topos whose 'internal type theory' can state the properties we want to prove.

Cody Roux (Apr 09 2021 at 18:13):

@Jonathan Sterling I deeply want to ask some follow-up questions about this, but I feel the thread is not the right one for this.