You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
I've been learning CT from Awodey's book, and if I understand correctly, the foundations that are being used there are set theory (To be very precise, ZFC + there is a model of ZFC). In this case, the category Sets has as objects the sets contained in the model.
I also think that it's possible to use other Set theories, maybe like ETCS. I'm not sure how you'd handle the category Set
in this case, but I am assuming that you'd make a similar axiom ( "there is a category of Sets")?
However, people ask talk about CT as foundations.. I'm not sure what this is? Are people talking about a formal system for the category of all categories?
If you want to encode all of mathematics in the same language, the classical way of doing it is to use first order logic + axiomatic set theory, like ZFC.
Then you can encode category theory in this kind of language. But you need a richer axiomatic set theory because of size issues.
The approaches using category theory directly in the foundations are based on type theory. They follow the constructivist trend in the foundations on mathematics. These logical systems are also foundations for computer science at the same time. Type theories are linked to category theory in the sense that you can interpret the proofs = programs = morphisms that they encode in some kind of categorical structures.
An important point is that you can do mathematics without encoding it in a formal logical system and it works pretty well (without being absolutely perfect). In this sense, I don't think Awodey was thinking whether he's working with ZFC or something else in his book. He's explaining category theory without formal foundations, as is done most of the time.
As Jean-Baptitse said, Awodey mostly just uses naive set theory, which is to say he assumes you know what a set is "more or less", defines functions between them, and moves on to the stuff he wants to talk about. (Most math is this way.)
If you want to see category theory and ETCS developed in parallel, one axiom at a time, without being unreadably formal, Lawvere & Roseburgh's Sets for Mathematics is what I personally like. It starts out deceptively basic, but by the end you have learned ETCS, category theory at about the level of Awodey, and are then learning topos theory, with a clear sense of exactly how generalized elements are different than elements in set, etc.
(My only warning is that they use some non-standard syntax concerning quantifiers;
instead of something like
(more "standard")
If it makes sense to you, no problem! If however you find yourself scratching your head, especially when you go to distribute over quantifiers, an excellent and free resource for basic logic and set theory that uses more standard notation is Clive Newstead's book; once you understand how to negate existential quantifiers and the like, which you can learn from the relevant sections in Newstead in about a week, I expect you'll find you can figure out what L&R mean.
Thanks a lot for your responses..
Jean-Baptiste Vienney said:
The approaches using category theory directly in the foundations are based on type theory. They follow the constructivist trend in the foundations on mathematics. These logical systems are also foundations for computer science at the same time. Type theories are linked to category theory in the sense that you can interpret the proofs = programs = morphisms that they encode in some kind of categorical structures.
So do you think it's better to say that type theory, as opposed to category theory is an alternative foundation for math?
@Suraaj K S Also, to answer some of your precise questions, in the ETCS case, Set is as much only meta or only informal as in the ZFC case without a self-model: it is there only the ambient universe, as you guessed: it’s just that in the ETCS case, which is a structural set theory based on the axiomatic theory of a category itself, it is more straightforward.
As for CT as a foundation, yes a structural theory of the category of categories was tried, ETCC, but even better (less evil) is Shulman’s ET2CC, since Cat is a 2-Cat!
Suraaj K S said:
Thanks a lot for your responses..
Jean-Baptiste Vienney said:
The approaches using category theory directly in the foundations are based on type theory. They follow the constructivist trend in the foundations on mathematics. These logical systems are also foundations for computer science at the same time. Type theories are linked to category theory in the sense that you can interpret the proofs = programs = morphisms that they encode in some kind of categorical structures.
So do you think it's better to say that type theory, as opposed to category theory is an alternative foundation for math?
I think type theory as a foundation is much more common and well-known that category theory as a foundation, but cf. what @Vincent R.B. Blazy just said.
Thanks for your responses. I have a small follow-up question.
When we say type-theory as foundations, will something like Martin Lof Type theory suffice?
I ask because I keep hearing that Homotopy Type theory can be the new foundation of mathematics. However, people seem to prove almost all math theorems in Coq / Agda / Lean just fine...
Martin-Lof type theory is a common sub-system of Coq, Lean, Agda, and Homotopy Type Theory.
Coq and Agda are compatible with homotopy type theory. In Coq this is done adding an extra axiom to the system called the univalence axiom, in Agda there is a compiler flag --cubical
which turns on homotopy type theory features. Lean 3 was compatible with homotopy type theory, Lean 4 is not.
You can prove a lot of interesting math without the univalence axiom, just like you can prove a lot of math working constructively without the law of excluded middle or the Axiom of Choice. Some people assume the law of excluded middle because it squares nicely with their mental model of the universe of discourse of mathematics (whether they have one or many.) Similarly some people assume the univalence axiom for the exact same reason.
Honestly, the vast majority of mathematics can be proved in a weak subsystem of second order arithmetic. You might even say, do we really need Coq?
People who want to prove theorems in homotopy type theory are interested in the world that is described by the axioms of homotopy type theory. It's not really important for you or me to have an opinion on whether "homotopy theory will be the next foundation of mathematics". The important thing here is whether you are interested in the world described by the homotopy type theory axioms.
Patrick Nicodemus said:
Lean 3 was compatible with homotopy type theory, Lean 4 is not.
Lean 2 was the one compatible with homotopy type theory.
What happened to make Lean incompatible with homotopy type theory? Or what didn't happen that would have made it stay compatible?
John Baez said:
What happened to make Lean incompatible with homotopy type theory? Or what didn't happen that would have made it stay compatible?
The lean developpers decided to gear the development of their proof assistant towards more classically flavoured mathematics. To do so, they introduced a type (universe) of (definitionally) proof-irrelevant propositions, meaning that two proofs of a proposition are identified , and declared that identity types are always propositions (so any two proof of equality are conflated by definition), together with a substitution principle (or dependent eliminator) whose type is (a more complex version of)
Since this substitution principle can eliminate into arbitrary type families (the can be Type-valued at any universe level), one can show that any other identity type (even non-propositional) in the theory with a similar elimination principle will be isomorphic to the one we started with, and two proofs of identity for this new type will also be identified, contradicting univalence (and HoTT).
So lean (versions 3 and 4) cannot be realistically used for Homotopy type theory, but on the other hand its theory is much closer to what one can find in the internal language of a topos for instance, and the definitional proof irrelevance can be really useful in practice.
Thanks for the explanation!
I was wondering about the following question:
It seems to me that traditionally, people seem to use ZFC + Grothendieck universes as their 'ambient' mathematical foundation, and CT as a language / framework on top of this.
And these foundations are 'classical'. I was wondering if it is possible to use some kind of constructive foundation? Do a lot of things change?
Some things change significantly, while other things stay the same.
Think about the real numbers for a second. In classical mathematics one can prove trichotomy for the real numbers; i.e. that for all real numbers and , or or . However, in constructive mathematics one cannot prove this for all real numbers. This has to be assumed as a separate axiom, called the [[analytic limited principle of omniscience]] in constructive analysis. Not to mention that there are multiple notions of real numbers which are equivalent in classical mathematics but cannot be proven to coincide in constructive mathematics.
Similarly, in classical mathematics, one can prove that ordinals satisfy trichotomy. However, in constructive ordinal theory, one cannot prove trichotomy for all ordinals, and that trichotomy holds for all ordinals is equivalent to excluded middle if I remember correctly. Not to mention that there are multiple notions of ordinals which are equivalent in classical mathematics but cannot be proven to coincide in constructive mathematics.
On the other hand, most basic facts about elementary number theory and algebra and category theory are constructive and so remain the same in classical foundations and in constructive foundations.
I was wondering if it is possible to use some kind of constructive foundation?
Madeleine answered this implicitly, but in case you were still wondering: yes! Many people do, and many people here are know a lot about this. (Not me.)
Thanks for your responses!
I'm a little curious about the following fact - Suppose we use a constructive variant of ZFC (+ universes) as our foundational metalogic, and use CT as a language / framework on top. Can we have both: the category of constructive Sets, and the category of classical Sets, which are distinct?
Perhaps what I am just asking is whether we can have a model of classical set theory in a constructive metalogic....
I'm no expert, but from what I've heard the answers are yes, and yes.
Note however that using CT as your framework doesn't commit you to working constructively: you can use category theory constructively or classically (and it's not just a binary choice).
So your second question is different from the first.
@John Baez , thanks a lot for your response!
John Baez said:
You can use category theory constructively or classically (and it's not just a binary choice).
I was wondering it means to 'use CT constructively or classically'? Where can I learn more about this?
An expert would be better at references. But many sorts of mathematics can be done either classically or constructively. So suppose you are trying to study set theory in a categorical framework - or as you put it, "use CT as a language / framework on top". There are many ways to do this. But here's one aspect: you can do it classically (where you assume the law of excluded middle and the axiom of choice) or constructively (where you don't, and any existence proof gives an actual construction of the entity whose existence is proved).
You can also not assume full excluded middle or the axiom of choice, just merely enough axioms to do whatever specific sort of mathematics classically but where everything else is still constructive.
for example, the [[limited principle of omniscience]] + [[countable choice]] for real analysis.
I think Suraaj wants to know where they can learn basic stuff about, say "studying classical set theory in a constructive metalogic formulated using categories". I don't know who has written about that!
John Baez said:
I think Suraaj wants to know where they can learn basic stuff about, say "studying classical set theory in a constructive metalogic formulated using categories". I don't know who has written about that!
Mike Shulman presents various classical set theories using constructive logic in https://arxiv.org/abs/1808.05204. I don't see how the metalogic being formulated using categories vs some other framework is relevant here.
The relevance is that Suraaj wants to learn about studying set theory in a category-based framework. Luckily, the paper you pointed us to does just that.
John Baez said:
The relevance is that Suraaj wants to learn about studying set theory in a category-based framework.
Okay, but that category-based framework doesn't really occur at the level of the metalogic, but at the level of the set theory itself.
Okay, thanks. So, @Suraaj K S, here's what I think is going on in Shulman's paper. It's studying various axioms for both classical and constructive set theory, which are all phrased in the language of intuitionistic first-order logic, and comparing the resulting categories of sets. As far as I can tell, the comparison is being done in a constructive metatheory.
I just want to point out Colin McLarty's talk "Why is so much Category Theory Constructive?" https://www.youtube.com/watch?v=5JlrauplT4k I don't have a lot to add, it might spark some useful ideas.
It's been a while, and thanks everyone for their time and answers.
Is it fair to say, in some sense, that it does not matter whether your foundational metalogic is classical / constructive? It seems to me that we could study classical sets in both cases (we have categories that satisfy axioms of classical set theory), and constructive sets in both cases, and several other 'intermediate' theories too! Maybe what I am saying is that we could totally have whatever fancy internal logic / type theory we want, irrespective of what foundations we are using...
Moreover, one could argue that constructive logics are more useful than classical logics (which is true, I think, because we can extract more useful things from proofs), and thus advocate to always use a constructive metalogic?
Is it fair to say, in some sense, that it does not matter whether your foundational metalogic is classical / constructive?
It certainly matters, because there are differences between the two approaches.
Moreover, one could argue that constructive logics are more useful than classical logics (which is true, I think, because we can extract more useful things from proofs).
Here's one way in which it matters.
Isabelle/HOL can be thought of as a foundation for math.
Coq can be thought of as a foundation for math.
Set Theory (Metamath0, etc.) can be thought as a foundation of math as well.
And I think that the way one establishes the 'equivalence' of these foundations is by 'constructing models' of one foundation in another (whatever that means).
And IIRC, it is possible to construct models of foundations in foundations (I'm thinking of papers like 'Sets in Types, Types in Sets').
I'm guessing that we could construct models of Isabelle/HOL in Coq and vice versa (maybe with some axioms).
What I am curious about is the following: People say that you can 'almost always' prove everything you could in coq, in Isabelle/HOL, but the proof will be harder.
What I'm curious about is the 'almost always' part... I'm guessing that if you can construct a model (whatever that means), then it should be just 'always'...
I'm curious where my gap in understanding is?
I'm not an expert on HOL, but does it have analogue of universes?
I don't know anything about this stuff, but I assume Mike is hinting that if HOL doesn't have universes there are things you can't state (much less prove) in HOL that you can prove in systems that have universes.
Looking up HOL, I see an introduction to HOL4 that describes it as handling higher-order logic but not set theory at all. Then I see an article that says:
Among the various foundations for formal proofs, set theory on top of higher-order logic has been tried a number of times in systems such as HOLZF [42], ProofPeer [43], Egal [10], and Isabelle/Mizar [28]. This foundation is attractive for formalization, as it offers a natural mathematical foundation combined with the automation present in HOL.
The formal proof libraries of Isabelle/HOL [55] and that of Mizar [4, 16] are among the largest proof libraries in existence today. Indeed, the HOL library together with the Archive of Formal Proofs consist of more than 100,000 theorems [6], while the Mizar Mathematical Library (MML) contains 59,000 theorems. Furthermore, the results contained in the libraries are incomparable: Almost all of the Mizar library concerns itself with mathematics, while the majority of the Isabelle/AFP library are results closer to computer science [6]. For example, the Mizar library includes results about lattice theory [9], topology, and manifolds [46] not present in the Isabelle library, while the Isabelle library has many results related to algorithms not in the MML [13, 36, 37].
In our previous work [7], we have presented a model of higher-order Tarski–Grothendieck, which justifies the use of higher-order logic formalizations with set theory-based ones simultaneously. This model will allow us to combine the results present in these two major Isabelle libraries.
I don't know what "higher-order Tarski–Grothendieck" is. It looks like to develop set theory on top of HOL4 people use other systems, most notably Isabelle or Mizar. But I don't know anything about those.
I now see the article I pointed to has information about all these things.
John Baez said:
I assume Mike is hinting that if HOL doesn't have universes there are things you can't state (much less prove) in HOL that you can prove in systems that have universes.
I see. I'm definitely know nothing about all this stuff... However, I'm curious about the following question then - If Isabelle/HOL doesn't have universes, is it safe to say that you cannot construct a model of Coq (which does have universes) in it?
Mike Shulman said:
I'm not an expert on HOL, but does it have analogue of universes?
HOL (and in particular Isabelle/HOL) does not have universes. You can posit the existence of objects coding universes as an assumption in order to prove relative statements, but I think this might be pretty messy to do in practice.
Suraaj K S said:
John Baez said:
I assume Mike is hinting that if HOL doesn't have universes there are things you can't state (much less prove) in HOL that you can prove in systems that have universes.
I see. I'm definitely know nothing about all this stuff... However, I'm curious about the following question then - If Isabelle/HOL doesn't have universes, is it safe to say that you cannot construct a model of Coq (which does have universes) in it?
My understanding is that Coq's consistency strength is supposed to be roughly the same as that of CIC, and although the exact consistency strength of CIC is (I think) open, a lower bound is Z + . Isabelle/HOL's strength should be the same as Z set theory (or equivalently -order arithmetic or the theory of a Boolean topos with NNO), which is strictly weaker, so you aren't going to be able to build a model of Coq in Isabelle/HOL unconditionally.
I recently had a conversation where someone told me that you cannot write an interpreter for Agda in Agda (or Coq in Coq), and I found that a little hard to believe. A quick google search also yielded nothing. I was wondering if the statement is true? Apparently, it is because of Godel's theorems..
You can't write a program in Agda that will decide if Agda programs will halt (or Coq programs will halt), but you must be able to write a program in Agda that runs programs in Agda entered in a suitable manner, or it's not a computationally universal programming language - which would be utterly shocking, since most programming languages are.
Did you ask your conversation partner to state Godel's theorems precisely? :smiling_imp:
Agda is not computationally universal as it is a total language—however, it’s worth saying universality has little to do with what you can actually build with the language
You can write an Agda program that checks if Agda programs halt: it always returns true, as they always halt
I guess the heart of the question is can you prove - All agda programs halt
in Agda
If you can, I think you can write an agda interpreter in agda
Correct, yes: you can’t prove all Agda programs halt in Agda
In fact, there’s an ongoing project to formalise the correctness of Coq in Coq (it’s called MetaCoq) and it requires an axiom akin to “all Coq programs halt”
Nathan Corbyn said:
Correct, yes: you can’t prove all Agda programs halt in Agda
Okay. I was wondering if this is because of Godel's second incompleteness theorem?
That is, does the fact that all agda programs halt
imply agda is consistent
? Moreover, can you prove this implication in Agda?
I’m not sure Gödel’s theorems apply directly (I’m not an expert at all), but I imagine it’s a similar idea.
The fact that all closed Agda programs halt at a value means that there are no closed programs of type , because there are no values of type . (How consistency is defined.)
I’ve no idea if anyone has proven this implication in Agda for the case of Agda but you can certainly prove similar statements for weaker systems in Agda (e.g., STLC)
I see. Then I would say that Godel's second incompleteness theorem would apply here (No sufficiently powerful formal system can prove its own consistency).
Nathan Corbyn said:
I’ve no idea if anyone has proven this implication in Agda for the case of Agda but you can certainly prove similar statements for weaker systems in Agda (e.g., STLC)
I do think that the argument you spelt out is simple enough to be proven in Agda.
The incompleteness theorem and the halting problem are birds of a feather in any case. (The feather being "diagonalization".)
Nathan Corbyn said:
Agda is not computationally universal as it is a total language...
Oh wow, I didn't know anyone did anything like 'general programming' with non-universal languages, though they make sense for lots of specialized purposes. (E.g. people complain that LaTeX is computationally universal, because you may not want the typesetting of your document to halt iff some Diophantine equation has a solution.)
If you want your programming language to also be a proof assistant, where every well-typed program is a proof, then it has to be a total language, since otherwise you could "prove" false with an infinite loop:
oops : ⊥
oops = oops
Some people say, "Pacman completeness is more useful than turing completeness"
While that means the language is technically not Turing-complete, so there are terminating functions that it cannot express, in practice dependently typed programming languages are powerful enough to write almost any actually-terminating function. For instance, you can manually introduce whatever well-founded termination measure is necessary for any particular application.
Coding "Agda in Agda" and related stuff is, I think, the only real exception I'm aware of.
Of course, rewriting a function to induct over a well-founded termination measure can be tedious and obscure the important ideas. So for practical programming, Agda also allows you to give a flag to turn off the termination-checker and make it Turing-complete, either globally or locally to a particular definition.
James E Hanson said:
Suraaj K S said:
John Baez said:
I assume Mike is hinting that if HOL doesn't have universes there are things you can't state (much less prove) in HOL that you can prove in systems that have universes.
I see. I'm definitely know nothing about all this stuff... However, I'm curious about the following question then - If Isabelle/HOL doesn't have universes, is it safe to say that you cannot construct a model of Coq (which does have universes) in it?
My understanding is that Coq's consistency strength is supposed to be roughly the same as that of CIC, and although the exact consistency strength of CIC is (I think) open, a lower bound is Z + . Isabelle/HOL's strength should be the same as Z set theory (or equivalently -order arithmetic or the theory of a Boolean topos with NNO), which is strictly weaker, so you aren't going to be able to build a model of Coq in Isabelle/HOL unconditionally.
I'm a little bit curious what the various notions of comparing two doctrines/theories/logics/type theories are... That is, when does one say that one is stronger than another..
I can think of the following ways:
ZFC_proves(φ)
in PA.I'm curious whether there are other 'obvious ways' one compares one logic to another? I'd also be curious to know how these notions of comparing one logic to another relate?
If you can construct a model of T1 in T2, in the sense that using T2 as the metatheory you can build a model of T1 (for example, in ZFC the set is a model of Z), then in particular T2 proves the consistency of T1, since a theory with a model is consistent. Conversely, at least in a sufficiently powerful metatheory, any consistent theory has a model (Godel's completeness theorem), so often proving consistency is the same as constructing internal models.
As we discussed in the other thread, interpretability is more or less the same as constructing a model of T1 from a model of T2, which is different than constructing a model of T1 in a model of T2. For instance, we can construct a model of ETCS+R from a model of ZFC by taking the objects to be the sets and the morphisms to be the functions. But the entire category that models ETCS+R is not a thing in ZFC (i.e. a set) because it is a proper class (the proper class of all sets). So what this shows is that ETCS+R is no stronger than ZFC. In this case, there is also a converse construction, so the two theories are equiconsistent (of equal strength).
So it's the difference between and .
I see. Just to confirm - constructing a model of T1 in T2 is the same thing as constructing a model of T1 in a model of T2... right?
As long as the latter construction is definable.
wrt. proving consistency of proof assistant X in X can be done in an axiom-free way as long as you use a slightly weaker version of that system. One simple way of doing this is just postulating a finite number of universes, rather than the usual countable family. logrel-mltt does this for Agda with one universe, and logrel-coq for Coq. Usually, you expect these arguments to require a small number of additional universes in the meta-theory.
and yes it is exactly Gödel's 2nd theorem that would apply here
Suraaj K S said:
I'm a little bit curious what the various notions of comparing two doctrines/theories/logics/type theories are... That is, when does one say that one is stronger than another..
I can think of the following ways:
- Constructing models - If you can construct a 'model' (whatever that means) of T1 in T2, you say T2 is more powerful than T1. Here, I think that T2 will also be the metalogic
- As a (syntactic) metalogic - You could use one logic as a metalogic for another, just for 'syntactic proofs'. However, in general, this does not seem like a hard thing to do at all. I think that you could use a very simple logic like PA as a metalogic for any other logic. That is, you can encode and prove things like
ZFC_proves(φ)
in PA.- Proving consistency - If one logic can prove that another is consistent, it is stronger than the other
- Interpretations - As mentioned in https://en.wikipedia.org/wiki/Interpretability. I think that this is linked very closely with 'constructing models' (or they might be the same thing), as we discussed here: https://categorytheory.zulipchat.com/#narrow/stream/229199-learning.3A-questions/topic/On.20Categorical.20Logic/near/472077762.
I'm curious whether there are other 'obvious ways' one compares one logic to another? I'd also be curious to know how these notions of comparing one logic to another relate?
I was doing some reading, and I was wondering about a few more things along these lines. To be specific, I was curious about how one compares 'doctrines' (like equational logic, λ→), etc.
'Obviously', λ→ is 'more powerful' than equational logic. All the axioms and inference rules of equational logic are 'present' in λ→. That is, if you can do a proof in equational logic, you can do 'the same proof' in λ→. Another concept that one talks about is 'conservativity', which is roughly the converse - If you can prove an equation between algebraic terms in λ→, you can prove that in equational logic too.
I'm wondering if there are 'standard definitions' that summarize this relationship between equational logic and λ→?
We have also spoken about other ways of relating theories/logics- constructing models, interpretations, etc. I'm curious if those ideas apply here? That is, is the relation between equational logic and λ→ just a 'special case' of one of the previous ideas? To me, it doesn't look like that's the case, because we're talking about 'comparing doctrines' now. But I might be wrong...
Semantically this kind of relationship between doctrines is represented by an adjunction between their categories of models. Every model of λ→ has an underlying model of equational logic. And any model of equational logic -- which in this case we usually think of as a "theory" -- gives rise to a "freely generated" model of λ→ -- which we think of as "the same theory in an expanded doctrine", this being left adjoint to the "underlying" functor in the other direction.
Conservativity of one doctrine over another is a statement about the units of that adjunction being faithful or fully faithful. That is, if is a theory in equational logic, its freely generated theory in λ→, and the underlying theory in equational logic of that, then the unit map being faithful would mean that if two terms in become (i.e. "can be proven") equal in λ→ (i.e. in ), then they were already equal (i.e. "can be proven equal") in . Full-faithfulness would say furthermore that given a type in a context in equational logic, any term that exists in that type in λ→ already exists in equational logic.
Mike Shulman said:
Semantically this kind of relationship between doctrines is represented by an adjunction between their categories of models. Every model of λ→ has an underlying model of equational logic. And any model of equational logic -- which in this case we usually think of as a "theory" -- gives rise to a "freely generated" model of λ→ -- which we think of as "the same theory in an expanded doctrine", this being left adjoint to the "underlying" functor in the other direction.
That makes sense. Just to be clear, is 'the category of models' the same thing as 'the category of structured categories' that one talks about in the InternalLanguage/TermModel adjunction?
Yes, that's what I meant.
E.g. each CCC is a model of the doctrine STLC, as distinct from a model of a particular theory of STLC in a particular CCC.