Suraaj K S 10:48 AM
I've been trying to learn a little bit about how category theory connects to logic and type theory.
Background-wise, I'm familiar with category theory at the level of Awodey's / Leinster's books, and axiomatic set theory (ZFC). I think it's safe to assume that I know very little about type theory.
Here are some books / resources I came across:
I am sure that there are many other resources.
Something that I was quite surprised about was that these books seemed to 'start off' with things which are quite different:
I was wondering how similar / different all these approaches are?
Is there a 'standard' approach that people use, and which book / resource would be a good starting point for that?
How do these other approaches (maybe using multicategories) relate to the standard approach?
I guess which reference you're looking for depends on what, precisely, you mean when you say "logic".
I'm going to recommend "Introduction to Higher Order Categorical Logic" by Lambek and Scott. If any book is a standard reference in categorical logic it is that one. I think it should be pretty approachable given your background.
The book spends a lot of time on the correspondence between the simply-typed lambda calculus and cartesian closed categories. I think if you're looking to learn some basic type theory this would be a good way to do it.
@Chad Nester , thanks for your response. I'd still be a little curious as to why so many different approaches are used, and whether all of them are different ways of looking at the same thing..
This is naturally going to be a bit sloppy because of space constraints, but here's the general idea. As a tl;dr, yes I think these people are all more-or-less talking about the same thing.
First, we have to realize there are multiple interesting "logics" one can study. There's a LOT to say here, but one quick observation is that "weak" logics can't say much, but what they can say is true in a lot of settings, whereas "strong" logics can say lots of things, but what they can say only matters in more limited circumstances.
For instance, if you're working with "equational logic", where all your axioms have to be equations between terms, you can interpret models inside any category with finite products. This gives you access to models in LOTS of settings (which is freeing!), but you can only express equational things (which is limiting!). Contrast this with, say "geometric logic", where your axioms can be much more flexible, so you can express lots of interesting properties (which is freeing!), but now you only get models inside topoi (which is limiting).
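(A concrete illustration, my own example rather than part of the message above: the equational theory of groups can be interpreted in any category $\mathcal{C}$ with finite products. A group object is an object $G$ equipped with arrows
$$m : G \times G \to G, \qquad e : 1 \to G, \qquad i : G \to G$$
satisfying the usual associativity, unit, and inverse equations as commuting diagrams, e.g. $m \circ (m \times \mathrm{id}) = m \circ (\mathrm{id} \times m)$ up to the canonical associativity isomorphism. Taking $\mathcal{C} = \mathbf{Set}$ recovers ordinary groups, $\mathcal{C} = \mathbf{Top}$ gives topological groups, and so on.)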
How does this work? In broad strokes, you think of an object in your category as a "context". That means the expressions you write down are allowed to have a variable $x$ "of type $A$". A colloquial example might be
"let $x : A$ and $y : A$. Then $s = t$" (for some terms $s$ and $t$ built from $x$ and $y$).
The context is "$x : A$ and $y : A$". It tells you what variables you have access to. Then in this context we're able to ask the question $s = t$, which has some truth value. We can interpret the context (with two variables of type $A$) as the object $A \times A$ in our category. We think about $s$ and $t$ as being two arrows $A \times A \to B$, and we want these arrows to be equal (which we can express by the commutativity of a diagram).
This brings us to the first thing you explicitly asked about, namely "multicategories". Instead of working with products in order to get an arrow with multiple inputs (read: a context with multiple variables), it would be nice if we natively had access to arrows with multiple sources in the definition of a category. This is because things in the definition of a category are the only things guaranteed to be strict, and this allows us to sidestep some potentially annoying "coherence issues" that are a frequent problem in categorical logic. A multicategory is just a category where we allow arrows to have multiple inputs.
How, I hear you asking, do we interpret more general propositions? Well, we assign to every type $A$ (read: object) a lattice $P(A)$ of "propositions on $A$". These are supposed to be "compatible" in the sense that whenever we have a map $f : A \to B$ and a proposition $\varphi \in P(B)$, we should get a proposition on $A$ defined by $f^*\varphi$ (we say this is defined "by pullback"). This data is exactly a fibred category, where the fibre over $A$ in your category is exactly the lattice $P(A)$, and the arrows in the total category come from these pullback operations.
To connect this with topoi, every topos has a natural subobject fibration, where we assign to the object $A$ the lattice $\mathrm{Sub}(A)$ of subobjects of $A$. This functor is represented by the "subobject classifier" $\Omega$, which is (part of) why you see the subobject classifier get so much attention.
Lastly, you ask why category theory is "naturally 2-categorical". Here Evan is getting at Lawvere's functorial semantics. Going back to equational theories because they're simpler, every theory $\mathbb{T}$ has a "classifying category" $\mathcal{C}_{\mathbb{T}}$
so that models of $\mathbb{T}$ in some category $\mathcal{D}$ are exactly (finite-product-preserving) functors from $\mathcal{C}_{\mathbb{T}}$ to $\mathcal{D}$! But the models are only half the story; where are the homomorphisms? It turns out that homomorphisms are exactly natural transformations between these functors!! So in order to get the whole category of $\mathbb{T}$-models in $\mathcal{D}$, you need access to the full 2-categorical structure of categories, functors, and natural transformations.
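(In symbols, with notation $\mathrm{Mod}$ and $\mathrm{FPFunc}$ that is mine rather than from the message above, functorial semantics says
$$\mathrm{Mod}(\mathbb{T}, \mathcal{D}) \;\simeq\; \mathrm{FPFunc}(\mathcal{C}_{\mathbb{T}}, \mathcal{D}),$$
an equivalence of categories whose objects are the finite-product-preserving functors, i.e. the models, and whose morphisms are the natural transformations between them, i.e. the homomorphisms.)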
So we see that these topics are all closely related, but they solve different problems that arise in the field.
I hope this helps ^_^
I was reading the Wikipedia article on Categorical Logic: https://en.wikipedia.org/wiki/Categorical_logic.
It says that there are 3 'themes' to categorical logic:
I was wondering if these three themes are 'independent', or interconnected?
A brief look seems to indicate that at least the first 2 (Categorical semantics and Internal languages) are quite independent ideas..
For instance, Mike Shulman's notes: https://mikeshulman.github.io/catlog/catlog.pdf seem to be about the second theme (type theories as internal languages). Moreover, searching for 'functorial semantics' in the pdf yields nothing!
On the other hand, Steve Awodey's notes: https://awodey.github.io/catlog/notes/ seem to be about the first theme.
I don't really know what the third theme means, but it looks like it should just be part of the first theme, as indicated by the phrase
In many cases, the categorical semantics of a logic provide a basis for establishing ...
So, am I right in assuming that these are two separate ideas, and Categorical Logic is in some sense, an overloaded term?
"Introduction to Higher Order Categorical Logic" by Lambek and Scott is my fav and best book ever!
'Logic' is an overloaded term. Categorical logic is at least cutting it down to a subset where categories are used in more-or-less essential ways :-)
What about Suraaj's first question: are "categorical semantics," "internal languages" and "term-model constructions" independent or interconnected?
I know nothing about the third, but I'd hope that at least the first two are deeply connected. For example, geometric morphisms between topoi are part of categorical semantics, while every topos has its own internal language. But these aren't independent ideas: a geometric morphism should map models of geometric theories formulated in the internal language of one topos to those formulated in the internal language of the other. Or something like that - maybe an expert could say this more accurately.
re: the third, you could call this "initial model constructions". the reason computer scientists call those "term models" is that the most direct route to construct them is from the "terms" (syntax) of the logic. Anyway, in e.g. equational logic and many others you can always find a distinguished initial model of a theory via a universal property etc. In fact in CQL we compute these all the time.
Thanks! If "term models" are initial models constructed synactically, which sounds completely believable, then these are definitely closely connected to the theme of internal languages.
So now I will claim all 3 topics in the Wikipedia article on categorical logic are tightly connected, and the article should be rewritten to explain how.
Suraaj K S said:
I was reading the Wikipedia article on Categorical Logic: https://en.wikipedia.org/wiki/Categorical_logic.
It says that there are 3 'themes' to categorical logic:
- Categorical Semantics
- Internal Languages
- Term-model constructions
I was wondering if these three themes are 'independent', or interconnected?
A brief look seems to indicate that at least the first 2 (Categorical semantics and Internal languages) are quite independent ideas. For instance, Mike Shulman's notes: https://mikeshulman.github.io/catlog/catlog.pdf seem to be about the second theme (type theories as internal languages). Moreover, searching for 'functorial semantics' in the pdf yields nothing!
On the other hand, Steve Awodey's notes: https://awodey.github.io/catlog/notes/ seem to be about the first theme.
I don't really know what the third theme means, but it looks like it should just be part of the first theme, as indicated by the phrase
In many cases, the categorical semantics of a logic provide a basis for establishing ...
So, am I right in assuming that these are two separate ideas, and Categorical Logic is in some sense, an overloaded term?
Check out my suggested readings.
I would say that the "categorical semantics" of a logic consists in describing a category of structured categories that is related to the category of theories in that logic by an adjunction, where the two functors in the adjunction are the "internal language" of a category and the "term model" of a theory.
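(Schematically, with names $\mathrm{Tm}$, $\mathrm{Lang}$, $\mathbf{Th}$, $\mathbf{StrCat}$ that are mine rather than Mike's, the picture is an adjunction
$$\mathrm{Tm} \;:\; \mathbf{Th} \;\rightleftarrows\; \mathbf{StrCat} \;:\; \mathrm{Lang}, \qquad \mathbf{StrCat}\big(\mathrm{Tm}(\mathbb{T}),\, \mathcal{C}\big) \;\cong\; \mathbf{Th}\big(\mathbb{T},\, \mathrm{Lang}(\mathcal{C})\big),$$
where $\mathrm{Tm}$ sends a theory to its term model and $\mathrm{Lang}$ sends a structured category to its internal language.)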
Nice, that ties it up neatly.
Mike Shulman said:
I would say that the "categorical semantics" of a logic consists in describing a category of structured categories that is related to the category of theories in that logic by an adjunction, where the two functors in the adjunction are the "internal language" of a category and the "term model" of a theory.
Thanks a lot! That seems concise enough to be put on a bumper sticker ;)
I was wondering if there are books / resources which mention this? IMHO, my ideal book on Categorical Logic would mention this at the very beginning, and the rest of the book would be various examples of this adjunction...
Also, am I right when I think of the 'classic' 'equivalence' of Cartesian Closed Categories and STLC as an adjunction as mentioned above (with CCCs being the 'structured categories', and the category of theories in that logic being the category of STLC theories)?
My unfinished notes, which you mentioned in the original post, are certainly written from that point of view, although I didn't emphasize that particular terminology.
One thing that's confusing is that we sometimes sloppily say that a logic or type theory X is "an internal language for" structured categories Y when they are related by a categorical-semantics adjunction, but that's a different meaning of "internal language" than when that phrase describes the right adjoint in the adjunction.
Also, it's worth noting that many people (but not me) would prefer that the adjunction is an equivalence.
I see. Thanks a lot for the explanation.
I am a little curious about the following sentence -
Mike Shulman said:
the "categorical semantics" of a logic consists in describing a category of structured categories that is related to the category of theories in that logic .
Here, examples of 'logic' would be equational logic, STLC, FOL, etc., right?
And are 'theories in that logic' simply pairs consisting of a signature and a set of axioms? For instance, ZFC is a 'theory' of FOL. (I'm curious what the morphisms between two theories are...)
One also talks about 'models' (sometimes called algebras?), but I guess that the category of structured categories takes this role? That is, for each 'theory' we have a category of models (which is an object of the category of structured categories...)
Yes, those are examples of what I meant by "logics". Whatever we call it, the class of objects in question also includes type theories (like STLC and MLTT); sometimes these are also called "doctrines" or "2-theories". Note that in general each such "logic" is multi-sorted.
And yes, a signature and a set of axioms is basically what I mean by a "theory" in a logic. Defining the morphisms between theories is part of where you get into the question of whether the adjunction is an equivalence. I would choose a morphism of theories $T \to T'$ to consist of maps of signatures and axioms, sending each base type of $T$ to a base type of $T'$, each function symbol of $T$ to a function symbol of $T'$, and each axiom of $T$ to an axiom of $T'$. But there are successive weakenings of this where you first allow axioms of $T$ to go to theorems of $T'$ rather than axioms, then allow function symbols to go to arbitrary terms, then allow base types to go to arbitrary types. The fully relaxed version is what tends to get you an equivalence between theories and categories, but it remembers much less about what distinguishes a theory from its term model.
As for models, if $T$ is a theory and $\mathcal{C}$ is a structured category, then a model of $T$ in $\mathcal{C}$ is a structured functor out of the term model of $T$, or equivalently a theory morphism from $T$ to the internal language of $\mathcal{C}$. (In the latter case it doesn't really matter which morphisms of theories you take, since theories that arise as internal languages are "saturated", in that every theorem is an axiom, every term is represented by a function symbol, and every type is isomorphic to a base type.) Although the former choice is usually better, since then you get the whole category of models without any extra work: a category of structured categories is actually a 2-category, whereas the category of theories is less obviously anything other than a 1-category.
Thanks for the explanation!
In Jacobs's book: https://people.mpi-sws.org/~dreyer/courses/catlogic/jacobs.pdf, I think what's happening is that what we call "logics" / "doctrines", etc. seem to be split into a 'logic' part and a 'type theory' part. For instance, in section 0.1, we have "a logic is always a logic over a type theory". Moreover, it seems like we can 'mix and match' different 'logics' and 'type theories' (in Jacobs's sense) to form different 'doctrines' (in the sense of the previous discussion).
Moreover, in that book, we have the following sentence: "Categorically this will correspond to one ("total") category, capturing the logic, being fibred over another ("base") category, capturing the type theory".
I was wondering how this 'fibred' category maps to our previous discussion, where we have a category of theories, a category Str of structured categories, and an adjunction between them.
Is the 'fibred' category simply a particular object in Str?
Yes, that's right. Sometimes a "structured category" means a category together with a fibration over it, or even something more like two fibrations, etc. In that case the corresponding doctrine has two classes of types, one or more of which might be called "propositions", and the part of the doctrine pertaining to terms belonging to "propositions" is traditionally called the "logic".
Thanks a lot for your patience and response! I hope this was not too annoying.
Mike Shulman said:
In that case the corresponding doctrine has two classes of types, one or more of which might be called "propositions", and the part of the doctrine pertaining to terms belonging to "propositions" is traditionally called the "logic".
This reminds me a little bit about how we have Prop and Set, etc. in Coq. I was wondering if this is related?
Mike Shulman said:
a signature and a set of axioms is basically what I mean by a "theory" in a logic
Hopefully, this is my last question (for now ;)). I'm guessing that a 'signature' here is a collection of sorts and first order function symbols (with constants being a special case).
I also think CT cannot handle 'higher order' function symbols, which (I think) was the point of the discussion here: https://categorytheory.zulipchat.com/#narrow/stream/229199-learning.3A-questions/topic/Did.20Category.20Theory.20catch.20up.3F/near/453035014...
Suraaj K S said:
This reminds me a little bit about how we have Prop and Set, etc. in Coq. I was wondering if this is related?
Yes, it's related. It's not exactly the same because Coq is a dependent type theory so that everything is all mixed up, with Prop and Set not being judgment forms but types themselves. But if you replace A : Prop with a judgment A prop and so on, you get the judgmental structure of a fibred logic of this sort, as in chapter 4 of my notes.
I like the fibrational perspective because I think toposes can sometimes bake in too much of a tight connection between propositions and types. For example, I don't think in the setting of a fibration modelling higher order logic, we have (for with )
Whereas this is true in a topos.
To me, this doesn't seem like something you would want to follow automatically. Toposes are a little bit too strong in that regard.
I love fibrations, but I don't really intuit why one might expect your example to ever fail to be true.
As for the meaning of "signature", in general it depends on the doctrine. A signature for a doctrine (which I would prefer to define in general to include the "axioms", since axioms are "really" just higher-dimensional generators) should consist of "generators" for all the sorts of the doctrine. So, for instance, in STLC, a signature consists of generating types, generating terms, and generating equalities (= 2-cells = relations between terms).
A "generating term" is also known as a function symbol, which has types as its input(s) and output(s). But you can make different choices about whether those input and output types must be generating types or arbitrary types. If they are required to be generating types, then the function symbol is "first-order", whereas if they are allowed to be arbitrary types (and the doctrine contains higher-order type-formers like ) then the function symbol is "higher-order". There's no problem with higher-order function symbols or their semantics in categories with higher-order structure.
I didn't read that other thread carefully, but I gathered the point was that there's no good notion of non-invertible morphism between models of a higher-order theory due to contravariance. This seems to me like just a fact, and I'm not quite sure why one would think that there should be such a notion. In any case there's no problem with higher-order function symbols in a doctrine that has higher-order type formers, although the construction of the adjunction is somewhat more involved (it goes like the one for -presentations in section 1.7.3 of my notes).
Mike Shulman said:
I love fibrations, but I don't really intuit why one might expect your example to ever fail to be true.
Hahahaha, yes, I agree it is a bit common sense. It's not that I have a persuasive counterexample semantics in mind; it's just that I don't like the proof and the way it relies on theorems about topos theory. The fact that it fails in a general higher-order fibration makes me suspect that there's some blurring of logic and semantics going on in the topos-theoretic proof, where we "cheat" by passing to the semantics to establish what should be a logical theorem.
I am sure there are formulations of intuitionistic higher-order logic that are complete for topos semantics, but since this fails in fibrations, it seems that completeness for the topos-theoretic semantics is not robust with respect to small changes in the proof calculus for intuitionistic higher-order logic.
Well, personally I think a better internal language for toposes is dependent type theory. You can more or less code all of mathematics in IHOL, but there are parts of it (like category theory!) that are much more natural in DTT. And in DTT, your example is provable.
I think one thing (perhaps the main thing?) that fails in a general higher-order fibration relative to a topos, and which is related to your example, is function comprehension, a.k.a. unique choice: from $\forall x.\, \exists! y.\, \varphi(x,y)$ we can't deduce the existence of a function $f$ such that $\forall x.\, \varphi(x, f(x))$. Again, this is a pretty unintuitive failure, and the most common thing we do with a higher-order fibration that fails it is construct a category in which it does hold, e.g. the tripos-to-topos construction.
Thanks a lot. This has been very enlightening.
I guess my initial confusion arose because I thought there was a 'unique way' to do categorical semantics for a logic, stemming from 'cool CT things' like the 'Curry-Howard-Lambek' correspondence, etc., where people say that STLC and CCCs 'are the same thing'. I guess I thought that every doctrine (e.g. STLC) would be 'the same as' some kind of structured category (e.g. CCCs). However, it seems like that's not the case: it seems like we could choose some other kind of structured category. For instance, it seems like we could use both fibred categories and topoi as our 'structured categories' for some doctrine (I don't know which doctrine was being discussed exactly, or maybe the point is that they work for all of them...). Now I'm curious about things like soundness/completeness, though. Do we only have 'completeness' when the adjunction is an equivalence?
Completeness, in the categorical sense, basically follows from the construction of the term model. If something is true in all structured categories, then it's true in the term model, and therefore it's provable in the theory since the term model is constructed so that the things true in it are precisely those provable in the theory. (Classically, completeness sometimes refers more specifically to completeness relative to one specific structured category such as Set; that's a much more special property that only holds for some doctrines and theories.)
It's true in general that a single syntactic doctrine can correspond to more than one kind of structured category. Patrick and I were talking about the syntactic doctrine of Intuitionistic Higher-Order Logic, which has semantics both in a kind of "higher-order fibration" and also in toposes. I'm not sure how special of a situation this is.
In a certain sense the higher-order fibrations are the more "natural" semantics of IHOL, since they represent its judgmental structure more directly, e.g. they have two classes of objects corresponding to the two judgments "is a type" and "is a proposition" of IHOL. It's usually possible to write down a "natural" semantics of any syntactic theory in this way by simply translating judgments to objects and morphisms. But if you take it seriously, sometimes what you get isn't what most people think of! E.g. the "natural" semantics of STLC is not really cartesian closed categories but rather cartesian closed multicategories. Of course CCCs and CCMCs are "equivalent" in a certain sense.
On the other hand, for highly structured categories like toposes and LCCCs, there doesn't seem to be a syntactic doctrine for which they are exactly the "natural" semantics. So instead we pass through an intermediate structure, like a higher-order fibration or a "category with families", that is the natural semantics of a doctrine like IHOL or DTT.
And my last two messages are a bit contradictory: completeness is only really "automatic" in the case of the "natural" semantics, since that's the case when the term model can be constructed directly out of the judgments in the theory. So Patrick's point was in general well-taken, that for, e.g., the semantics of IHOL in toposes, there may be things that hold in all topos models but aren't provable syntactically. IIRC I think this doesn't precisely happen for IHOL and toposes, because the things that hold in all toposes but not all higher-order fibrations (like Patrick's example) aren't expressible internally in the syntax of IHOL; but even if I'm right about that, it's sort of an accident.
(Or, one might say, a consequence of the very special nature of toposes.)
(Or, another one might say, the impoverished syntax of IHOL.)
Mike Shulman said:
Completeness, in the categorical sense, basically follows from the construction of the term model
Just to confirm, this is the sentence which is not exactly correct? Completeness 'basically follows' only when we have the 'natural semantics'?
Yeah.
Another thing that I think happens is that these 'doctrines' can be programming languages too, and you can similarly give categorical semantics to programming languages too? And in many cases (I think), that these PL 'doctrines' are lousy logics because they are inconsistent (because nonterminating programs are 'proofs' of ⊥)? I think this book: https://www.cambridge.org/core/books/categories-for-types/258841251C62FED1DACD20884E59E61C takes that view?
Yes, many doctrines have a computational interpretation, and so many programming languages have corresponding doctrines. And yes, if the programming language is Turing-complete, then every type in the doctrine will be inhabited by some term, which means that the corresponding categories can't have strict initial objects the way topos-like categories usually do. Instead, the corresponding categories are those arising from domain theory. As you say, it's hard to use a naive propositions-as-types interpretation of logic when all types are inhabited, but there may be things you can do.
Why does Turing completeness imply inhabitedness of all types? If I consider the unityped lambda calculus, which is Turing complete, and add a second type with no term formers, the calculus can still compute every Turing-computable function, so I would have thought this would be considered Turing complete without having all types inhabited.
Usually you add a fixed point combinator of type fix_X : (X->X)->X and then every type would be inhabited by "fix id", or show that such a construction can be emulated already (in the case of the untyped lambda calculus the Y combinator plays the role of fix for example). It's an interesting exercise to write a term of type "Void" in Haskell for example.
Yes, I was thinking of Turing-completeness at every type, according to which the empty function is one of the partial functions that would be required to be represented. I guess it's true that you could have a language that's Turing-complete at some types and not others.
Ryan Wisnesky said:
It's an interesting exercise to write a term of type "Void" in Haskell for example.
Do you mean in some more complicated way than ...?
that's indeed what I meant; the pedagogy is in discovering why you can't write that same definition in e.g. Coq, how recursion is implicit in Haskell, and how axioms such as letrec can lead to terms in types that at first appear to have no constructors (such as Void).
Ah, yes. Sorry, it sounded to me like you were suggesting that writing such a term required using a fixpoint combinator.
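(For concreteness, a minimal sketch of the kind of definitions being discussed; the code and names here are mine, assuming GHC's Data.Void.)

```haskell
import Data.Void (Void)

-- General recursion is what makes every type inhabited in Haskell:
-- this definition typechecks at the empty type, but never terminates.
loop :: Void
loop = loop

-- The same thing via an explicit fixed-point combinator fix_X : (X -> X) -> X,
-- as in the message about fix above:
fix :: (a -> a) -> a
fix f = f (fix f)

alsoLoop :: Void
alsoLoop = fix id
```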
I'm a little curious if / how Logical Relations might fit into this framework?
Logical Relations basically consist of interpreting a theory in a special sort of diagram/glued category. I wrote an old blog post about it from that point of view.
Mike Shulman said:
Logical Relations basically consist of interpreting a theory in a special sort of diagram/glued category. I wrote an old blog post about it from that point of view.
I see. I was wondering if this also follows the 'general shape' of categorical semantics (with a category of structured categories, a category of theories, and an adjunction?), where these 'diagram/glued categories' are the 'structured categories'?
The diagram/glued category (for a particular shape of logical relations) is one of the structured categories.
Mike Shulman said:
The diagram/glued category (for a particular shape of logical relations) is one of the structured categories.
Oh yes. Okay, that makes sense...
I'm also guessing that we can use the same ideas with "fancier" type theories / doctrines like HoTT?
On a related note, if I understand correctly, we have an adjunction between the category of infinity groupoids (the category of structured categories), and the theories of MLTT?
Yes, you can do the same thing with HoTT and so on, but there are two different ways to do it. On one hand, you can work with 1-categorical presentations of structured higher categories like categories with families, and then you can expect an ordinary 1-adjunction just like for ordinary type theories. On the other hand, you can work with a higher category of structured higher categories, but then things get a bit more fraught and you may need to invert some "weak equivalence" theory morphisms.
For MLTT, the structured categories are "directly" categories with families with the appropriate type structure. The more indirect version is locally cartesian closed $\infty$-categories (for intensional MLTT with funext), although technically that is still a conjecture. The $\infty$-category of $\infty$-groupoids is one such structured category.
Mike Shulman said:
On one hand, you can work with 1-categorical presentations of structured higher categories like categories with families, and then you can expect an ordinary 1-adjunction just like for ordinary type theories. On the other hand, you can work with a higher category of structured higher categories, but then things get a bit more fraught and you may need to invert some "weak equivalence" theory morphisms.
I was wondering if there are pros / cons of both approaches? The first does seem simpler / easier
I'm curious if 'theories' of doctrines 'can be' doctrines themselves?
For instance, in this thread: #learning: questions > Higher-order logic as a 'theory' of STLC , we discuss how HOL is a theory of STLC (the doctrine). Being a logic, would one consider HOL a doctrine as well?
I guess that an extension to that question would be - can something like Set Theory (maybe ZFC / ETCS), which is a theory of first order logic, be considered a doctrine as well?
Do we have a five-star, hard, old book on doctrines?
Suraaj K S said:
I was wondering if there are pros / cons of both approaches? The first does seem simpler / easier
Sorry, I guess I missed this question at first. The first is indeed easier, and in fact most current ways to do the second actually go through the first, by establishing an equivalence between an $\infty$-category of structured $\infty$-categories and a localization of some 1-category of CwF-like objects and then applying the first approach to the latter.
One disadvantage of the first approach is that it requires using a coherence theorem to strictify your structured higher categories into an appropriate kind of CwF. It's not always clear how to do this, or if it's even possible (e.g. strictifying univalent universes was an open problem for a while). Moreover, such strictification theorems generally rely heavily on a set-theoretic metatheory (meaning ZFC or ETCS, but not HoTT). Of course, insofar as one does the second approach via the first approach, it inherits the same problem, but at least in theory one might imagine a version of the second approach that doesn't require strictification.
One disadvantage of the second approach is that it's less clear how to formulate it as an adjunction rather than an equivalence, because syntactic theories naturally form only a 1-category, and we can't really expect to have an "internal language" functor from an $\infty$-category of structured $\infty$-categories to a 1-category of theories. If we invert some theory weak equivalences we can make an $\infty$-category of theories, but this seems to usually end up equivalent to that of structured $\infty$-categories.
However, all of this is pretty speculative, since the second approach hasn't been pursued very much so there isn't much that we can say about it for sure. Perhaps if someone takes it seriously, they'll find that I'm totally wrong. (-:
Suraaj K S said:
I'm curious if 'theories' of doctrines 'can be' doctrines themselves?
For instance, in this thread: #learning: questions > Higher-order logic as a 'theory' of STLC , we discuss how HOL is a theory of STLC (the doctrine). Being a logic, would one consider HOL a doctrine as well?
One sense in which theories can be doctrines is with a dimension shift. A doctrine (= 1-doctrine) is the same as a 2-theory, which lives in some 2-doctrine (= 3-theory). And so a 1-theory can be considered a 0-doctrine, in which we have 0-theories. For example, if we regard the 1-theory of groups as a 0-doctrine, then the 0-theories in that 0-doctrine are (presentations of) particular groups.
However, I don't think this is what you have in mind. It sounds to me as though what you're thinking of is closer to a "co-slice doctrine". For example, if we consider the semantic doctrine of CCCs (whose syntactic doctrine is STLC), then in that doctrine we have the theory of group objects. Semantically, this is a particular CCC $\mathcal{T}_{\mathrm{Grp}}$ that is freely generated by a group object, so that CCC morphisms $\mathcal{T}_{\mathrm{Grp}} \to \mathcal{C}$ are equivalent to group objects in $\mathcal{C}$. Now if we consider the co-slice 2-category $\mathcal{T}_{\mathrm{Grp}}/\mathrm{CCC}$, we can regard this as the semantic doctrine of "CCCs containing a group object". It has a corresponding syntactic doctrine, which is obtained from STLC by adding the constants and axioms of the syntactic theory of groups as rules to the doctrine.
The description in the other thread of "HOL as a theory of STLC" seems to be thinking of something like this, of obtaining the doctrine HOL by adding the rules relating to $\Omega$ to the doctrine STLC, and thinking of those rules as a "theory" in STLC. I don't think this is quite correct, though, because the quantifier rules like $\forall$ have to be parametrized over arbitrary types, and that's not something you can do inside a theory. So I would say rather that HOL is an extension of STLC obtained by adding one base type $\Omega$ and several rules relating to it.
Suraaj K S said:
I'm curious if 'theories' of doctrines 'can be' doctrines themselves?
I thought about this question a lot. in its time, many people asked various versions of the question:
what follows from the fact that category theory streamlines mathematical theories into corresponding universes (category of sets, category of groups, etc) and category theory itself is a mathematical theory that can be streamlined like that? what does it mean logically that category theory itself satisfies the axioms of a 2-category?
lawvere tried to axiomatize that. street had his theory of cosmoi, etc. in your terms, if we formalize the theory of doctrines as 2-monads, is there a 2-monad of doctrines?
the answer that mike shulman sketched is that there is a dimension shift: the "doctrine" of 2-monads would be a 3-monad. which is clear already on the basic case of "category" of categories being a 2-category.
BUT I think that this shift is a consequence of asking the question in terms of algebraic theories. category theory is essentially algebraic and categorical algebra covers a great part of categorical practice, so it seemed natural to ask the question in terms of monads and 2-monads. but in higher-order logic, all theories become algebraic...
i think it is worth dwelling on the question as it stands BECAUSE if we find a format of categorical logic where category theory is an instance of a categorical theory THEN category theory becomes a programming language, in the sense that the logical theories are programs, and any formalization of category theory itself within category theory is a universal interpreter of that programming language.
i think THAT point of view displays what we actually do when we do category theory.
the success of machine learning and the realization of AI that we witness these days shows that our logical practices in general are based on such universal interpreters in various forms at various levels of language... (I can say more if it is of interest.)
one example of a categorical programming language is the theory of sketches. there is a sketch of all sketches, and a sketch of their models...
Mike Shulman said:
Well, personally I think a better internal language for toposes is dependent type theory.
lawvere's logic for toposes was the language of hyperdoctrines. note that the top of the doctrinal diagram is dependent type theory, or at least the sums and the products as adjoints of the substitution. the bottom is quantifiers-as-adjoints. with all predicates at the bottom being substitution instances of Truth, the doctrinal diagram gives Girard's F-omega. with the comprehension adjunction between the top and the bottom, lawvere's doctrinal diagram describes a type theory equivalent to coquand-huet's calculus of constructions.
it is interesting that his equational hyperdoctrines put a sort of a straitjacket on identity types. he did not pursue the question of relating the equality predicate that he defined using the existential quantifier and the leibniz equality, defined using second-order universal quantifier. that is what thomas streicher did, and they found the groupoids :)
Mike Shulman said:
The description in the other thread of "HOL as a theory of STLC" seems to be thinking of something like this, of obtaining the doctrine HOL by adding the rules relating to $\Omega$ to the doctrine STLC, and thinking of those rules as a "theory" in STLC. I don't think this is quite correct, though, because the quantifier rules like $\forall$ have to be parametrized over arbitrary types, and that's not something you can do inside a theory. So I would say rather that HOL is an extension of STLC obtained by adding one base type $\Omega$ and several rules relating to it.
This makes sense, and I think this is what I thought too... I guess I was confused because this book: https://mitpress.mit.edu/9780262133210/foundations-for-programming-languages/ (specifically Example 4.4.7) explicitly says, "we can present higher-order logic as a λ→ theory". The 'signature' of this theory has infinitely many types of the form ∀<σ>, apparently...
It's not just about having infinitely many operators: it's that you want to have a $\forall_\sigma$ for all types $\sigma$, and you can't say that in a theory. Even if you add a $\forall_\sigma$ operator to the theory for each specific type $\sigma$ that exists in the theory, it won't follow that a model of that theory has a $\forall$ operator for every type (object) that exists in the model.
Mike Shulman said:
It's not just about having infinitely many operators: it's that you want to have a $\forall_\sigma$ for all types $\sigma$, and you can't say that in a theory. Even if you add a $\forall_\sigma$ operator to the theory for each specific type $\sigma$ that exists in the theory, it won't follow that a model of that theory has a $\forall$ operator for every type (object) that exists in the model.
I see.. I think that the book was maybe just showing a 'cute' syntactic trick, where you can 'encode' and 'do' HOL in STLC...
Mike Shulman said:
Logical Relations basically consist of interpreting a theory in a special sort of diagram/glued category. I wrote an old blog post about it from that point of view.
Another thing that I seem to keep coming back to is this discussion. It does seem like CT 'can handle' logical relations and parametricity.
However, https://www.cs.bham.ac.uk/~udr/papers/logical-relations-and-parametricity.pdf seems to talk about how CT 'cannot handle' parametricity, and we need a CT 2.0.
We have had a thread of discussion on this in the past, but I'm a little bit confused as to what the resolution was... Is it safe to say that we do have a satisfactory answer to the questions put forth in the paper? Or does the paper actually demonstrate a 'real' limitation that exists even today?
I think that's an exaggeration of what that paper says. For one thing, they describe in that very paper two ways in which category theory can handle parametricity, and their reflexive-graph categories are pretty much the same as one of the diagram categories I referred to.
That makes sense.
I think that I was confused by some other comments made by the paper's author here: https://cs.stackexchange.com/questions/9769/what-is-the-relation-between-functors-in-sml-and-category-theory and https://cs.stackexchange.com/questions/9818/what-is-meant-by-category-theory-doesnt-yet-know-how-to-deal-with-higher-order, the specific comment being "Category Theory doesn't know how to deal with higher-order functions. Maybe someday, it will"
Yeah, those comments are pretty nonsensical.
I was reading https://plato.stanford.edu/entries/goedel-incompleteness/.
In the article, there seems to be a particular section about 'the method of interpretations' (particularly section 1.2.2). I was wondering if what is being referred to there was simply the idea of 'constructing a model' of one theory in another?
There also seems to be a notion of interpretability (https://en.wikipedia.org/wiki/Interpretability), which seems to be another way of relating a theory to another.
I was wondering if there is some connection between the two notions?
Moreover, I'm also curious what notion the article is referring to?
I'm not sure exactly what that SEP article is referring to, but it could be either "constructing a model of one theory from a model of another" or the sort of "interpretation" referred to on that WP page. However, there is a close connection between the two: if you can construct a model of T from any model of S in a "uniform" or "elementary" way (I'm not sure how to make that precise), then that construction also gives a way to translate formulas in T to formulas in S, by sending a T-formula $\varphi$ to the S-formula "the T-model constructed from the present S-model satisfies $\varphi$".
Indeed, the semantic notion (constructing a model of S inside a model of T in a 'nice' way) and the syntactic notion (map each relation symbol of S to a formula in T, preserving satisfaction) agree. The idea is just that we interpret each atomic formula in one language as a formula in another, which gives a "uniform" choice of definable set in each model. So what the SEP article is talking about is the same notion as in the wiki article. See for instance 1.3 of Marker's Model Theory or 2.6 of Hodges's Model Theory. Note that an interpretation of S into T gives a functor from Mod(T) to Mod(S) between the categories of models (see "Glymour & Quine on theoretical equivalence" and "Morita equivalence" by Barrett and Halvorson).
Thanks! I guess there is some way of making "in a nice way" precise?
Also, I would be inclined to talk about constructing a model of S from a model of T rather than "inside" it, since the latter sounds to me like doing it internally in such a way that T would prove the consistency of S. Do model theorists use "inside" for the former instead?
My understanding (which might be wrong!) is that they tend to use "inside" in that way because the notion of interpretation really does give relative consistency. If there's an interpretation of S in T, then if T is consistent, S is consistent too. I'm not sure about the best way to say "in a nice way" purely semantically; it really means: interpret each relation symbol as a definable subset and extend in the usual way by induction. So maybe "definable interpretation" is a better description of what's going on.
Ah, yes, "definable interpretation" makes sense.
To be clear, there seem to be two ways of 'comparing' one theory T with another theory S: constructing a model of one theory in the other, and 'interpretations', which I understood as encoding statements like ZFC_proves(φ) of one theory inside the other. Is this right? If what I said was correct, then it seems to me that the second way (interpretations) should be much easier than the first. All you need to do is 'hack up/encode' the proof system of one theory in another. Moreover, I think that it would always be possible to use this method of interpretations for any theory in PA by using things like Gödel encodings. Is this correct? Or is the Wikipedia article mentioning yet another way of comparing one theory with another?
No, I don't think that's what's meant by interpretations. You seem to be basically describing using one theory as the metatheory for another. I think "interpretation" means instead that both theories are in the same metatheory, and you construct a translation from formulas of one to formulas of the other (or from models of one to models of the other, in the other direction).
There's a standard concept of 'interpretation' in (classical, non-category theoretic) model theory, described here:
There must be many other concepts called 'interpretation'....
Mike Shulman said:
The description in the other thread of "HOL as a theory of STLC" seems to be thinking of something like this, of obtaining the doctrine HOL by adding the rules relating to $\Omega$ to the doctrine STLC, and thinking of those rules as a "theory" in STLC. I don't think this is quite correct, though, because the quantifier rules like $\forall$ have to be parametrized over arbitrary types, and that's not something you can do inside a theory. So I would say rather that HOL is an extension of STLC obtained by adding one base type $\Omega$ and several rules relating to it.
I was doing some reading and found this answer on the internet: https://cstheory.stackexchange.com/a/47313/73876. I was wondering if what you meant was also something similar?
Yes, that's what I was thinking "HOL as a theory of STLC" meant, and what I was responding to. But Andrej generally knows what he's talking about, so if he's willing to call it a theory then I wonder whether I'm missing something.
Mike Shulman said:
Yes, that's what I was thinking "HOL as a theory of STLC" meant, and what I was responding to. But Andrej generally knows what he's talking about, so if he's willing to call it a theory then I wonder whether I'm missing something.
Something worth mentioning might be that the answer does talk about both 'a theory of STLC' as well as 'a slight extension of STLC'...
When one talks about STLC (and how it's equivalent to the category of Cartesian Closed (Multi)Categories), you assume 0, ×, → as the type constructors.
However, in introductory PL courses, we are usually taught λ→, that is, just with function types. (I am still talking about the typed λ calculus here.)
I was wondering how / if λ→ is treated in categorical logic. Something that seems to bug me a little bit is that you need products in your categories to define exponentials, and λ→ does not have product types..
The categorical counterpart of λ→ is closed cartesian multicategories. That is, a [[cartesian multicategory]] that is closed in the multicategorical sense that it has internal hom-objects with a universal property: multimorphisms $(\Gamma, A) \to B$ correspond to multimorphisms $\Gamma \to (A \Rightarrow B)$.
In general, multicategories are the context in which you can define exponentials without products, corresponding to how syntax can have multiple variables in the context even if there aren't product types.
Note that we can also define exponentials without there being cartesian products, using representability instead
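(Concretely, and this is my own rendering of the standard representability definition rather than a quote: $B^A$ is an exponential in this sense when there is an isomorphism, natural in $X$,
$$\mathcal{C}(X, B^A) \;\cong\; [\mathcal{C}^{\mathrm{op}}, \mathbf{Set}]\big(yX \times yA,\; yB\big),$$
so the product is taken in presheaves, and no product $X \times A$ needs to exist in $\mathcal{C}$ itself.)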
This should coincide with the approach via closed cartesian multicategories: since the presheaf category $[\mathcal{C}^{\mathrm{op}}, \mathbf{Set}]$ is cartesian closed, it is in particular a closed cartesian multicategory, and we can take the full sub-cartesian-multicategory spanned by the representables. The condition for $B^A$ to exist in the category $\mathcal{C}$ (without assuming products with $A$ exist in $\mathcal{C}$) will be the condition that $B^A$ exists in the cartesian multicategory of representables. In other words, every category is the underlying category of a canonical cartesian multicategory, and asking for exponentials in the representable sense is the same as asking for exponentials in the canonical cartesian multicategory.
I would say it's a particular case of closed cartesian multicategories. A general closed cartesian multicategory won't be "canonical" in that sense; in fact it seems rare to me that it would be except in the case when it also has finite products.
are you saying there are closed cartesian multicategories that are not equivalent to that canonical one given by Arkor using representable presheaves?
Yes. The main "naturally occurring" one that I can think of is the syntactic category of λ→, in which the objects are types and the multi-morphisms are terms in a finite context. I don't see any reason why for types , every natural transformation would be represented by a term .
Note that here "" is the representable presheaf on the underlying ordinary category, so it only knows about terms in length-1 contexts.
what if you took presheaves on the category of contexts instead?
The category of contexts in λ→ is a CCC.
So, yes, its cartesian multicategory structure is the canonical one, as it is for any category with finite products.
In categorical logic / type theory, the judgements seem to be of equality. For example, you have the beta equality rule in STLC, etc.
On the other hand, for computation, you want rules for directed reduction. For instance, you have a beta reduction rule in STLC too. These reduction rules need to 'play well' with the equality rules, and are often directed versions of equality rules, but not always (for example eta reduction).
I'm wondering if there is some 'method to this madness'? Something like, given a type theory with equality judgements, can one automatically derive 'the right' reduction rules?
Oh, would that that were true! Finding a good way to "orient" the equality rules to make them reduction steps is one of the trickiest, if not the trickiest, parts of turning a "declarative" type theory into a computational one. And proving that such a choice "works", which goes by names like "canonicity" and "normalization", is generally one of the more difficult and substantial metatheorems to establish about a new type theory.
People have established good intuitions, heuristics, and general techniques that help a lot. But there's no automatic way to produce reduction rules from equality rules; and if you consider the space of all possible "declarative type theories", the ones that admit a computational interpretation are, I think, going to be quite sparse. It's just that those tend to also be the most interesting and useful ones, so we're more familiar with them. That's not entirely a coincidence, but it still feels kind of magical to me on some days.
Mike Shulman said:
Oh, would that that were true! Finding a good way to "orient" the equality rules to make them reduction steps is one of the trickiest, if not the trickiest, parts of turning a "declarative" type theory into a computational one. And proving that such a choice "works", which goes by names like "canonicity" and "normalization", is generally one of the more difficult and substantial metatheorems to establish about a new type theory.
I see. That makes sense..
I'm curious whether 'canonicity' and 'normalization' are synonyms? From what I understand: if m → m', then m = m'. That is, reduction 'agrees with' equality. For instance, the beta reduction and beta equality rules in STLC agree.
Your property that "if m → m', then m = m'" is what I would call "soundness" of the reduction system: it doesn't violate the judgmental equality. That's sort of a "bare minimum" requirement of any notion of reduction to be sensible.
Normalization is, roughly, about the converse. It says that there is an algorithm that computes from any term (in any context) a "normal form" (which we often think of as another term, although it doesn't have to be) with the property that two terms are judgmentally equal if and only if their normal forms are exactly equal. This allows you to algorithmically test judgmental equality by reducing both terms to normal form and then checking them syntactically for equality. The traditional sort of normalization algorithm consists in applying primitive "reduction steps" one by one until the algorithm terminates (termination being part of what one has to prove), but more modern treatments tend to use a fancier sort of algorithm called "normalization by evaluation".
Canonicity is technically a statement only about the judgmental equality, and is stated for a particular concrete type like the natural numbers: it says that for any term t in the empty context belonging to the natural numbers type, there is a term c in canonical form -- which in the case of the natural numbers means it is a "numeral" of the form S(S(S(...S(0)...))) -- and a judgmental equality t = c.
However, usually we ask for an algorithm that computes c from t, and this algorithm is usually very similar to the normalization algorithm. Specifically, if every normal form belonging to the natural numbers in the empty context is a numeral, then canonicity follows from normalization. (However, you can do things like add arbitrary axioms, which preserve normalization since they are equivalent to prepending variables to the context, but do not generally preserve canonicity.) On the other hand, while canonicity obviously does not imply normalization, and is often significantly easier to prove, in practice it's quite hard to think of type theories that satisfy canonicity but not normalization; usually canonicity is proven first, since it's easier and points the way towards how to prove normalization.
That makes sense!
Does one think of things like 'observational'/'contextual' equivalence too, when doing categorical logic? Or is that 'just' a programming language thing?
Also, does one consider the description of 'what normal and canonical forms are', a part of the core type theory? or are they 'add-ons' which are useful for proving things about the core type theory?
Suraaj K S said:
Does one think of things like 'observational'/'contextual' equivalence too, when doing categorical logic? Or is that 'just' a programming language thing?
For the most part, "algorithmic" equality isn't relevant to categorical logic, except insofar as it coincides with judgmental equality. Some people have tried to formulate categorical structures that "see" it, such as 2-categories whose 2-cells represent reductions, but this is generally in the vein of "how can we come up with a categorical description of the algorithmic behavior" rather than the reduction steps meaning anything directly in naturally-occurring categories.
Of course, when normalization holds, then algorithmic equality does coincide with judgmental equality, and this is very useful for coding proof assistants since it gives a way to test equality algorithmically. In a dependent type theory, testing equality is essential to having a decidable typechecking algorithm. So for practical use in a proof assistant, this sort of thing is important, even if it isn't needed for the mathematical theory of categorical semantics.
Suraaj K S said:
Also, does one consider the description of 'what normal and canonical forms are', a part of the core type theory? or are they 'add-ons' which are useful for proving things about the core type theory?
I think most people would consider them add-ons, but I'm not positive. They certainly are closely associated with the theory.
When you say algorithmic equality, do you just mean reduction rules made into an equivalence relation?
When I was talking about 'observational' equivalence between t1 and t2, the notion is more about whether a context can distinguish between t1 and t2. Here, a context is a program with a 'hole' in it.
Let's take the untyped lambda calculus for example. We can say that t1 is contextually equivalent to t2 iff for all contexts c, c[t1] terminates iff c[t2] terminates.
Yes, I'm aware that people study "contextual" equalities like that, but that's not what I was talking about. I meant the equality given by a normalization algorithm (which, again, in a modern context might not be simply repeated applications of reduction rules). I thought that's what you meant too since you said up above
You could take the previous idea further, by using reduction to define a new notion of equality called contextual/observational equality.
Mike Shulman said:
I meant the equality given by a normalization algorithm (which, again, in a modern context might not be simply repeated applications of reduction rules)
Oh wow, I did not know that! I always thought that the reduction rules were the algorithm..
Reduction rules give a "small-step operational semantics", from which you can also extract a "big-step operational semantics". However you can get a big-step operational semantics by other means. "Normalization by evaluation" means extracting a big-step operational semantics from a denotational semantics!
James Deikun said:
"Normalization by evaluation" means extracting a big-step operational semantics from a denotational semantics!
While true, I feel this is a bit misleading, because it makes it sound magic, whereas in fact I think all the work of constructing a normalization algorithm is just moved into constructing a denotational semantics. (-:
Although maybe that would be obvious to a computer scientist.
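(For what it's worth, here is a minimal sketch, entirely my own code and not from the thread, of normalization by evaluation for the untyped lambda calculus: terms are evaluated into a semantic domain of Haskell functions and neutral terms, then "quoted" back into syntactic normal forms. All the real content is in the choice of semantic domain, as the previous message suggests.)

```haskell
-- Normalization by evaluation for untyped lambda terms (de Bruijn indices).
data Term = Var Int | Lam Term | App Term Term
  deriving Show

-- Semantic domain: either a genuine function, or a stuck neutral term.
data Value = VLam (Value -> Value) | VNeutral Neutral
data Neutral = NVar Int | NApp Neutral Value

-- Evaluation into the semantic domain, given an environment of values.
eval :: [Value] -> Term -> Value
eval env (Var i)   = env !! i
eval env (Lam b)   = VLam (\v -> eval (v : env) b)
eval env (App f a) = apply (eval env f) (eval env a)

apply :: Value -> Value -> Value
apply (VLam f)     v = f v
apply (VNeutral n) v = VNeutral (NApp n v)

-- Quotation: read a semantic value back as a normal-form term.
-- The Int is the number of binders we are under (used for fresh variables).
quote :: Int -> Value -> Term
quote d (VLam f)     = Lam (quote (d + 1) (f (VNeutral (NVar d))))
quote d (VNeutral n) = quoteN d n

quoteN :: Int -> Neutral -> Term
quoteN d (NVar k)   = Var (d - 1 - k)   -- convert de Bruijn levels back to indices
quoteN d (NApp n v) = App (quoteN d n) (quote d v)

-- Only terminates on terms that have a normal form, as expected for untyped terms.
normalize :: Term -> Term
normalize t = quote 0 (eval [] t)
```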
Here's another question that I was wondering about:
In a few introduction to CT courses / books (for example: Awodey, https://www.cs.uoregon.edu/research/summerschool/summer23/_lectures/Introduction_to_Category_Theory_notes.pdf, etc.), a connection between monads and algebras is introduced. If I understand correctly, an 'algebraic theory' (which is traditionally a signature and a set of axioms) 'is just a monad'... For instance, the last section in the linked OPLSS lecture notes talks about this.
What I'm curious about is whether there is a link to the style of categorical logic as we were discussing in this thread (an adjunction between a category of theories and a category of structured categories)?
Are you thinking specifically of relating these notions of "theory"?
Mike Shulman said:
Are you thinking specifically of relating these notions of "theory"?
Yes. I think that would be a good start...
I was just wondering if the monads as algebraic theories idea was a 'neat' idea, or whether it was 'just' an instance of some deeper concept in categorical logic
In categorical logic, a model is a functor in the category of structured categories.
On the other hand, an algebra over a monad T is an object A, along with a morphism a : TA → A, subject to a few conditions
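(For reference, the "few conditions" are the usual unit and multiplication laws: a ∘ η_A = id_A and a ∘ μ_A = a ∘ T(a), where η and μ are the unit and multiplication of the monad T.)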
Here's another monad related question:
In categorical logic, we have an adjunction between a category of theories and a category of structured categories.
We also discussed that doctrines could be 'programming languages' too, and 'models' give a denotational semantics of programming languages. For instance, I think that https://www.cambridge.org/core/books/categories-for-types/258841251C62FED1DACD20884E59E61C talks about such categorical semantics for programming languages.
What I was wondering about is the following: Moggi's seminal work (https://www.cs.cmu.edu/~crary/819-f09/Moggi91.pdf) gives a 'monadic' semantics for programming languages with effects. I was wondering if this would fit cleanly in the framework of categorical semantics as we have been discussing so far?
For instance, I'm curious what the 'category of structured categories' would be in this case?
To the second question, if the doctrine involves the monad as an explicit operator, then the "structured categories" would be categories equipped with a monad.
Otherwise, it might be something like a [[Freyd category]].
To the first question, there's some overlap in the notions of "theory" but they're not identical. The theories in the doctrine of finite products are multi-sorted, finitary algebraic theories, while monads on Set are single-sorted, infinitary algebraic theories. The intersection consists of single-sorted finitary algebraic theories.
A multi-sorted theory does also induce a monad, but rather than on Set, this monad is on the category Set^S, where S is the set of sorts (i.e. base types).
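For instance (an illustration of the multi-sorted case, not from the thread): the two-sorted theory of directed graphs has sorts v and e and operations s, t : e → v, and the induced monad on Set^{v,e} sends (V, E) to (V + E + E, E), since the free algebra adds a formal source and a formal target vertex for each edge and creates no new edges.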
I don't know of a direct categorical connection between the category of finite-product theories and the collection of monads on powers of Set, although I suppose there probably is one.
And for other doctrines, there isn't much of a connection to monads in general.
Mike Shulman said:
To the first question, there's some overlap in the notions of "theory" but they're not identical. The theories in the doctrine of finite products are multi-sorted, finitary algebraic theories, while monads on Set are single-sorted, infinitary algebraic theories. The intersection consists of single-sorted finitary algebraic theories.
I see. If I understand correctly, these 'theories' are objects of the 'category of theories of the doctrine', right? And you'd have an adjunction with some other 'category of structured categories' in both cases?
Moreover, could we call 'monads on Set' a doctrine? I guess that the sentence says that the theories of this doctrine are single-sorted, infinitary algebraic theories. What'd be the 'category of structured categories' in this case? Furthermore, could we have models in categories other than Sets?
I'm sorry for the barrage of questions :(
The multi-sorted finitary algebraic theories are the objects of the category of theories of the doctrine of finite products, yes. But single-sorted theories don't really fit well into the doctrinal framework. You can, of course, consider the full subcategory of the category of multi-sorted theories determined by those that have only one sort, but that category doesn't have an adjunction to a category of structured categories, since the "internal language" theory of a structured category is multi-sorted (its sorts are the objects of the category).
In general a doctrine specifies the kinds of operations that are allowed on types and terms, and a theory in that doctrine gives particular generating types, terms, and axioms. So single-sortedness is really a property of a theory rather than an aspect of a doctrine.
Also I wasn't very precise in saying "infinitary". In general a monad can have "infinitary operations" but I don't think there is an explicit notion of "infinitary theory" that's equivalent to a monad on Set. There is a notion of [[infinitary Lawvere theory]] but it's not well-understood as far as I know, in particular it isn't known whether every such gives rise to a monad, or whether every monad arises from such a thing.
What you can say is that for any regular cardinal κ, there's a notion of κ-ary algebraic theory, and the single-sorted ones of those are equivalent to κ-accessible monads on Set. But there are also inaccessible monads on Set, like the one whose algebras are compact Hausdorff spaces.
Mike Shulman said:
I don't think there is an explicit notion of "infinitary theory" that's equivalent to a monad on Set. There is a notion of [[infinitary Lawvere theory]] but it's not well-understood as far as I know, in particular it isn't known whether every such gives rise to a monad, or whether every monad arises from such a thing.
Perhaps I'm misunderstanding something, but I would have thought an "infinitary Lawvere theory" is precisely what Linton calls a "varietal theory", i.e. an identity-on-objects product-preserving functor from Set^op, and these are trivially equivalent to monads on Set.
(Of course, characterising the presentations that correspond to varietal theories is not at all straightforward.)
Ok, that's what I thought 14 years ago, but then Toby Bartels disagreed and I never really paid attention to the discussion that followed, and the resulting nLab page is ambiguous. If you're confident about that, maybe you could edit the nLab page?
But in any case, you're right that it doesn't really help with the connection to doctrines, since the "local smallness" condition isn't syntactic.
Nathanael Arkor said:
Perhaps I'm misunderstanding something, but I would have thought an "infinitary Lawvere theory" is precisely what Linton calls a "varietal theory", i.e. an identity-on-objects product-preserving functor from Set^op, and these are trivially equivalent to monads on Set.
I’ve not read the discussion Mike is alluding to but my understanding was: this would only cover the single-sorted case (maybe that’s fine) and also (I think) requires choice to present each set as isomorphic to a small coproduct of a fixed singleton set (edit: so that coproducts of these presentations are strictly associative). I’d thought the point was that it is not enough to merely have small products but we also need to _choose_ them for the notion of model. If this is accurate, the usual notion of (single-sorted) Lawvere theory would more closely correspond to a product preserving identity-on-objects functor from the opposite of the category of cardinal numbers and functions between them. The multi-sorted case would then follow by taking powers.
Every set X is canonically and constructively isomorphic to a small coproduct of copies of a fixed singleton: X ≅ ∐_{x ∈ X} 1. That doesn't require any choice.
Sorry, I meant in such a way that e.g., coproducts of these presentations are strictly associative
I don't understand, why would you need that?
I’d thought it would be essentially the same reason that a Lawvere theory is defined as an IOO functor out of the _skeleton_ of finite sets (op’d): you have a single representative of each arity and structure is preserved strictly.
(Not that I necessarily think these are good reasons, I was just trying to make sense of what was on the nLab page in terms of my understanding of the finitary case)
Ah. I don't think that's necessary in the finitary case either, it just makes some people happier to have only one object of each arity rather than a bunch of isomorphic copies. I think the fact that you'd need AC to get it in the infinitary case is a good reason to drop that requirement, as the nLab definition does.
That makes sense
Perhaps I'm missing part of the discussion, but wouldn't complete Boolean algebras be considered an infinitary algebraic theory? Those don't come from a monad; in fact the forgetful functor to sets doesn't have a left adjoint (free complete Boolean algebras do not generally exist; they exist only on finite sets).
I personally would refer to that as a "large algebraic theory", but reserve "infinitary algebraic theory" for the notion corresponding to arbitrary monads on Set. But different authors have made different terminological choices.
There are four notions (in the one-sorted case for simplicity):
(1) a syntactic presentation of an infinitary algebraic theory, i.e. operations of arbitrary small arity together with equational axioms;
(2) the category such a presentation generates, with objects the small powers of a single generic object and morphisms the derived operations, i.e. an identity-on-objects small-product-preserving functor out of Set^op, allowing the hom-classes to be large;
(3) such a thing that is in addition locally small (Linton's "varietal theories");
(4) a monad on Set.
Obviously any (1) generates a (2), and we can think of (2) as the "extensional essence" of a (1). And every (4) generates a (3), namely the opposite of its Kleisli category, while there is an argument that every (3) induces a (4) that it would be nice to have spelled out in more detail (or perhaps it's in some reference, maybe the Linton mentioned above?). Complete Boolean algebras are a (1)+(2) but not a (3)+(4).
So what do we call what? I can see arguments for either (2) or (3) being an "infinitary Lawvere theory"; (2) is the more straightforward modification of the usual notion, but lacks the correspondence to (4) that's an essential part of the theory of ordinary Lawvere theories. I do see a pretty strong argument for an "infinitary algebraic theory" being a (1), since "algebraic theory" suggests to me something that can be presented syntactically, and there's no clear syntactic counterpart of the local smallness condition in (3). But, as Nathanael said, different people may make different choices.
or perhaps it's in some reference, maybe the Linton mentioned above?
(Yes, that's right.)
I was reading the appendix of https://mikeshulman.github.io/catlog/catlog.pdf.
There is the following sentence: "Similarly, an equality judgment like x : A |- M ≡ N : B is shorthand for (x.M) ≡ (x.N) : (A |- B)".
If the notes are taking the view that terms are 'just' annotations, I'm a little confused as to what exactly the real judgement is? Are they the sorts (which are defined to be judgements) of the form Σ^≡ (as shown in A.3)? Or are they 'simpler' sorts? I think I'm a little bit confused when 'term annotations' are introduced...
I've also never seen deductive systems presented this way. I was wondering if this is standard? or something novel?
I wrote in A.4 that a "judgment" is a sort in one of the signatures in a dependency sequence, which is therefore defined in some unspecified way from the initial algebras of the previous ones. In the case of something like Σ^≡, so that we're talking about "equality judgments", these sorts are pairs of "derivations" of some other judgment in a previous signature, so that they could potentially be equated. For example, in the case of categories, we start with a signature whose sorts are written (A ⊢ B), and so the sorts of Σ^≡ are triples (f, g, (A ⊢ B)) where f and g are derivations of (A ⊢ B). The perspective I'm taking in that appendix is that M and N are just notations for two such derivations f and g.
I wouldn't say this presentation is "standard", as I don't think I've seen it written down like that before. But I don't think it's all that novel either; what I wrote was an attempt to describe what I had learned (or thought I learned) about how some type theorists think about what's going on in many cases, even if they aren't that precise about it.
Mike Shulman said:
sorts of Σ≡ are triples (f,g,(A⊢B))
I think that to match the notes exactly, they need to be ((A⊢B),f,g)?
However, I think I understand it a little better now
Also, I was curious if there were plans to extend those notes to include dependent type theories and polymorphic type theories?
It depends. Does a "plan" have to be time-bound? (-:O
I've a small question about 'term systems' as defined by those notes (Definition A.5.2):
It says that a "term system for Σ is a Σ-algebra...". However, there is an example in the previous page where the 'term system' for a cut-ful type theory of categories (where the sorts are pairs of objects in the category) is a single sorted free algebra of untyped terms. I'm wondering if I'm missing something here?
The example is also followed by a parenthetical that says
For technical reasons, rather than a single set of terms as here, in the general case we will allow each judgment of our intended theory to be assigned a different set of "potential terms"; see below.
Mike Shulman said:
The example is also followed by a parenthetical that says
For technical reasons, rather than a single set of terms as here, in the general case we will allow each judgment of our intended theory to be assigned a different set of "potential terms"; see below.
I think I was just wondering as to whether the single sorted term algebra example 'could be adapted' to the multisorted case.
Well, I didn't develop any of the examples very precisely, but I think what I had in mind was that (in the absence of binding) to define a term system, you would first define a single-sorted signature, usually with one operation corresponding to each (typed) operation of Σ, consider the initial algebra for that signature, and then use that same set as the (possibly ill-typed) terms at all judgments.
In the presence of binding, I would probably take the possibly-ill-typed terms to nevertheless be well-scoped, so that for a judgment with a context of length n the terms would be those with n free variables.
This is mentioned later in section A.6:
in a simple type theory (chapter 2) the terms for (Γ ⊢ B) should have n variable bindings outermost, where n is the length of Γ (this is why in §A.5 we allowed different judgments to have different sets of potential terms).
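As a small illustration of "possibly ill-typed but well-scoped" terms (my own sketch, not from the notes; Haskell with made-up names), one can use de Bruijn indices and a predicate saying a term only uses the n variables supplied by a judgment's context:

data Tm = Var Int | Lam Tm | App Tm Tm
  deriving Show

-- scopedUnder n t holds when t only refers to de Bruijn indices below n,
-- i.e. when t is a candidate term for a judgment whose context has length n.
scopedUnder :: Int -> Tm -> Bool
scopedUnder n (Var i)   = 0 <= i && i < n
scopedUnder n (Lam b)   = scopedUnder (n + 1) b
scopedUnder n (App f a) = scopedUnder n f && scopedUnder n a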
Mike Shulman said:
Well, I didn't develop any of the examples very precisely, but I think what I had in mind was that (in the absence of binding) to define a term system, you would first define a single-sorted signature, usually with one operation corresponding to each (typed) operation of Σ, consider the initial algebra for that signature, and then use that same set as the (possibly ill-typed) terms at all judgments.
And is the idea that you then give a Σ-algebra structure on this (family of sets which are really the same set)? The notes say that a "term system for Σ is a Σ-algebra"...
I think that the appendix is quite interesting because I really haven't seen precise definitions of what a 'deductive system' is. And we seem to use them all over the place: logics, programming languages (type judgements, operational semantics, and even syntax!)
Yes, that's right, you give a (non-free) Σ-algebra structure on that family of sets, using the way you chose the single-sorted signature for which the set of terms is the free algebra.
Often, these 'inference rule systems' also have inference rules with 'side conditions' which can be semantic. Would it be safe to say that the deductive systems in the notes don't handle those systems?
What sort of conditions are you thinking of?
Here's an example: https://www.cs.cornell.edu/courses/cs6110/2023sp/lectures/lec08.pdf. In section 3.1, you have a rule with the side condition that n2 is the sum of n0 and n1.
Moreover, in https://www.cs.cornell.edu/courses/cs6110/2023sp/lectures/lec06.pdf, section 5, formula (4) is one definition of what an inference rule is..
Suraaj K S said:
Here's an example: https://www.cs.cornell.edu/courses/cs6110/2023sp/lectures/lec08.pdf. In section 3.1, you have a rule with the side condition that n2 is the sum of n0 and n1.
I would fit that in my framework by saying that there is actually an infinite family of judgments, one for each external integer n, and what looks like one rule with a side condition is shorthand for infinitely many rules, one for each triple of external integers (n0, n1, n2) with n2 = n0 + n1.
There are certainly different formal definitions of "deductive system" that one might give, and the one you cited is indeed different from mine. That one appears to be focused on inductive definitions of subsets of some previously fixed set, whereas mine is focused more on inductively defining new sets. With the latter end in mind, I needed to insist that the entire structure be algebraic, which means there are no "side conditions" in a formal sense, although as in this case many "side conditions" can be reexpressed as talking about parametrized families of judgments and rules.
Another thing that I'm noting (which might be incorrect) is that while the definition of deductive systems in your notes seems to restrict the kinds of systems you can build, you can always go overboard and define whatever you want. What I'm thinking might be better expressed using an example like the following:
Let your Σ1 be a signature with one sort and a single nullary operator unit. WΣ1 has only one element.
I think you could define Σ2 with the following sorts - each sort is a pair <r, unit> where r is a real number and unit is the element of WΣ1. Now, your set of sorts is isomorphic to the reals.
Is there anything that prevents me from defining such a 'weird' deductive system?
Nope! That's perfectly fine.
I remember feeling a sense of shock when I first realized real vector spaces are algebras of a Lawvere theory. That was a case of having uncountably many operations rather than sorts.
I remember feeling a similar sense of shock when I read about infinitary logic, which allows certain formulas of infinite length, e.g. conjunctions and disjunctions of up to κ many formulas for some infinite cardinal κ.
My feeling was something like "hey, syntax isn't supposed to be so big!"
Of course, if one actually wants to implement a type theory in a proof assistant or as a programming language, then the syntax has to be finitely generated.
Another way that the notes seem to differ from typical descriptions of type theories is as follows (I think)...
I think that one traditionally thinks of judgements of the form λx.x : Nat → Nat, whereas the notes above would treat the judgement as just Nat → Nat, and λx.x is a 'term'. I guess you could also have another deductive system (as defined in the notes) where the judgements are really 'traditional judgements' too?? Going a little further, maybe you even have some freedom / cleverness in actually choosing what part of a 'traditional judgement' you consider a 'term' and the rest is the judgement..
I don't know what's "traditional", but it's true that I made a choice to regard terms as notations for derivations of judgments as much as possible, rather than regarding typing judgments as a property of terms. My reason for that is that it makes it easier to define categorical semantics by induction over derivations. However, as you say, the formal setup of my Appendix A works just as well for a deductive system with typing judgments as a property of terms.
I see. I also think that, in your framework, if you think of things like λx.x : Nat → Nat as judgements (which are sorts), then you'd end up having at most one operator for each sort (where the sort appears as the 'target' of the operator), which is ?trivial? and maybe not that useful..
John Baez said:
I remember feeling a sense of shock when I first realized real vector spaces are algebras of a Lawvere theory. That was a case of having uncountably many operations rather than sorts.
I remember feeling a similar sense of shock when I read about infinitary logic, which allows certain formulas of infinite length, e.g. conjunctions and disjunctions of up to κ many formulas for some infinite cardinal κ.
My feeling was something like "hey, syntax isn't supposed to be so big!"
Alternatively, one could shift over to a notion of multi-sorted Lawvere theory, involving a sort symbol R for the ground ring (usually a field) and a sort symbol V for the module (or vector space), and tame that syntax back under control. This is not just a stylistic choice. For example, this is the right choice to make if you want to interpret a topological vector space as a product-preserving functor from this theory to topological spaces: you want the interpretation to give a continuous map R × V → V, a stronger requirement than just having the uncountably many scalar operations be continuous.
That's a good point. In the end I actually enjoyed the shock of thinking about theories that are too big to write down. There's some logician who talked about the "sacred" versus "profane" attitudes toward logic, where in the sacred approach you are struggling to come as close as possible to absolute certainty, and thus want to found everything on rules for manipulating finite strings drawn from a finite alphabet, while in the "profane" approach you treat it more like other mathematics, so you're more willing to countenance the use of infinite structures to study logic.
The first time I had to deal with infinitary logic was when Steve Vickers and Ming Ng wrote a series of papers on real analysis and Ostrowski's theorem, informally using the internal logic of a Grothendieck topos in their papers.
I had a hard time wrapping my head around it because the infinitary nature of the internal logic meant that it was nigh impossible to come up with a finite set of rules to formalise said logic, and it was hard to accept that some things cannot be formalised using inference rules from scratch.
I guess this can be formalised by adding an auxiliary level and judgments representing the base topos, like the actual definition of a Grothendieck topos and similar to the multi-sorted Lawvere theories mentioned above, but at the time I was trying to avoid such multi-level theories.
Todd Trimble said:
For example, this is the right choice to make if you want to interpret a topological vector space as a product-preserving functor from this theory to topological spaces: you want the interpretation to give a continuous map R × V → V, a stronger requirement than just having the uncountably many scalar operations be continuous.
Of course, that choice has the consequence that its algebras are topological modules over any topological ring, rather than specifically over the real numbers.
A way to get specifically topological real vector spaces is to use John's original theory but regard it as topologically enriched, so that the product-preserving functors to spaces have to also be topologically enriched.
Suraaj K S said:
I see. I also think that, in your framework, if you think of things like λx.x : Nat → Nat as judgements (which are sorts), then you'd end up having at most one operator for each sort (where the sort appears as the 'target' of the operator), which is ?trivial? and maybe not that useful..
Yes, that's true, but I don't think that necessarily makes it trivial or unuseful.
John Baez said:
That's a good point. In the end I actually enjoyed the shock of thinking about theories that are too big to write down. There's some logician who talked about the "sacred" versus "profane" attitudes toward logic, where in the sacred approach you are struggling to come as close as possible to absolute certainty, and thus want to found everything on rules for manipulating finite strings drawn from a finite alphabet, while in the "profane" approach you treat it more like other mathematics, so you're more willing to countenance the use of infinite structures to study logic.
I think it might have been @David Corfield at the n-Category Cafe who spoke of "sacred" (which I just misspelled as "scared", ha!) versus "profane". I like it. Model theorists must be very profane. There's a standard trick where you start with a model of a theory and then create a new theory with a constant term for every element of that model, and that feels like a profane tactic in this sense.
Mike wrote
A way to get specifically topological real vector spaces is to use John's original theory but regard it as topologically enriched, so that the product-preserving functors to spaces have to also be topologically enriched.
Ah yes, where the parameter object of unary operations has the topological structure of ℝ.
Todd Trimble said:
I think it might have been David Corfield at the n-Category Cafe who spoke of "sacred" (which I just misspelled as "scared", ha!) versus "profane". I like it.
Actually I learned about this "sacred/profane" distinction from James Dolan, who has several times told me that the terminology goes back to some famous logician. Unfortunately he can never remember which logician.
This is the Cafe post by David. I wonder if James (who also goes by "Jim", which I've settled on having heard him give this name in a coffee shop once, but who doesn't commit to a definitive choice if you ask him) had in mind Dirk van Dalen, the author of this quotation from David's post:
Logic appears in a ‘sacred’ and in a ‘profane’ form; the sacred form is dominant in proof theory, the profane form in model theory. The phenomenon is not unfamiliar, one observes this dichotomy also in other areas, e.g. set theory and recursion theory. Some early catastrophes such as the discovery of the set theoretical paradoxes or the definability paradoxes make us treat a subject for some time with the utmost awe and diffidence. Sooner or later, however, people start to treat the matter in a more free and easy way. Being raised in the ‘sacred’ tradition my first encounter with the profane tradition was something like a culture shock. Hartley Rogers introduced me to a more relaxed world of logic by his example of teaching recursion theory to mathematicians as if it were just an ordinary course in, say, linear algebra or algebraic topology. In the course of time I have come to accept this viewpoint as the didactically sound one: before going into esoteric niceties one should develop a certain feeling for the subject and obtain a reasonable amount of plain working knowledge. For this reason this introductory text sets out in the profane vein and tends towards the sacred only at the end. (p. V)
Abraham Robinson was probably someone who was relaxed and profane. Tarski: definitely not.
I bet Jim read this quote! Thanks - I'll ask him.
Todd Trimble said:
This is the Cafe post by David
Good to revisit that discussion.
With the use of categorical logic now to plan one's breakfast have we reached the full extent of the profane, or do we recover a sense of the sacred in the everyday?
Yes, this seems to be where James Dolan heard this story (and then recounted it to me):
yes, a lot of the specifics seem as i remember it, for example having grown up in the sacred tradition but then converting under hartley rogers's influence, seeing logic taught "as though like any other mathematical field", etc ....
This sentence is on the Wikipedia article on Model theory too!
https://en.m.wikipedia.org/wiki/Model_theory
Sorry for the off-topic, but this sentence from David Corfield
With the use of categorical logic now to plan one's breakfast have we reached the full extent of the profane, or do we recover a sense of the sacred in the everyday?
inevitably reminds me of the historian of religions Mircea Eliade, who wrote often about hierophanies of the sacred breaking into the everyday and quotidian, typically in camouflaged form. I just wonder whether David was also recalling him, when he wrote that sentence.
I suppose category theory is my religion, and I probably often sound that way, too, with the Yoneda lemma being a holy of holies. On the other hand, when I'm doing a categorical calculation at home, it just feels like an ordinary piece of algebra, without any feeling of "awe and diffidence". So I guess I don't practice what I preach. ;-)
Todd Trimble said:
I just wonder whether David was also recalling him, when he wrote that sentence.
Someone recently recommended Eliade to me, but I haven't started yet.
I think it was part Buddhist ("Show me where the Buddha isn't, and I'll spit there") and part reflection on the ingenuity that allowed everything to come together in the same place to be able to make that breakfast, right back to the domestication of wheat and the rise of metalworking.
"Show me where the Buddha isn't, and I'll spit there" <-- what a completely fantastic utterance, almost like a koan. It reminds me of a question a student asked after a talk by Bernie Glassman, a Zen teacher/roshi, something on the order of "how can I fully live in the here and now?", and Glassman replied, "Will anyone who is not here now please stand up!".
I was reading "Notions of Computation and Monads" (https://www.cs.cmu.edu/~crary/819-f09/Moggi91.pdf). Here, the author talks about three semantic-based approaches to prove equivalence of programs - operational, denotational, and logical.
I think it's safe to say that there's no clear categorical semantics analog of program equivalence using operational semantics, which, if I understand correctly, is defined in terms of 'contextual/observational equivalence' of program terms.
On the other hand, I think that the logical approach might just be the game of categorical semantics (an adjunction between a category of structured categories and the category of theories, although I don't know whether the programming language would be the doctrine or a theory).
Denotational semantics is also probably a part of this framework, where your intended model is just a structured category (or maybe a functor too, I'm not too sure)?
I think it's safe to say that there's no clear categorical semantics analog of program equivalence using operational semantics, which, if I understand correctly, is defined in terms of 'contextual/observational equivalence' of program terms.
If there isn't, it could be that category theorists have work left to do! There are definitely papers that try to understand operational semantics using category theory. However, since I'm not an expert on operational semantics, I'm not sure how close they come to capturing the concepts that experts in operational semantics want captured!
I'm not enough of a computer scientist to be entirely confident in my answer, but reading what he says, it does sound like what he calls the "logical approach" is indeed just what we call categorical semantics, and what he calls the "denotational approach" is just a special case of that for a particular "intended" model.
I think usually the "intended model" in denotational semantics is some kind of category of domains. You can try https://en.wikipedia.org/wiki/Denotational_semantics to start with.
Mike Shulman said:
I think usually the "intended model" in denotational semantics is some kind of category of domains. You can try https://en.wikipedia.org/wiki/Denotational_semantics to start with.
Would it be correct to say that the intended model is just a category of domains? Or does it also include the interpretation functor (which is a morphism in the category of structured categories)?
It includes the functor.
The view I take on operational semantics is that it gives a recipe for building a categorical semantics via some sort of realizability model. There are lots of knobs you can turn here, but the basic plan typically follows the following shape:
This is all rather hand-wavy, so let's work through a concrete example. Our first move is to pick a theory of computation: I'm going to go with Hofstra's basic combinatorial objects (BCOs), as they nicely model small-step operational semantics. In particular, a BCO consists of:
We should think of this as a sort of computable weak composite of two computable functions.
A nice example of a BCO is the untyped lambda calculus equipped with a call-by-value operational semantics:
which is pretty straightforward to show.
Finally, let's show how we can link up to categorical logic by showing how to turn a BCO A into a fibration of preorders of A-valued predicates.
In more intuitive terms, one predicate entails another whenever we have an A-computable function that turns A-computable evidence of the first into A-computable evidence of the second.
From here, we can use the normal tools of categorical logic to build even more interesting fibrations from this one, use its internal logic, etc. What this ends up buying us is the ability to prove that certain programs (a) exist and (b) obey some specification.
Would you say that such a realizability model could then be considered an "intended" model and thereby become a "denotational" semantics?
To be honest, I don't really like the term "denotational". Typically, it's defined as "constructing a bunch of mathematical objects that describe the meaning of programs", but this includes any notion of semantics lol.
Putting terminological gripes aside, here's my understanding of what the linked PDF is trying to get across. In the "denotational approach", we are doing all of our work within a single category (the "intended model"). This is typically how we'd work when using some sort of domain-theoretic semantics, as the category of domains is typically rich enough that we don't need to leave. Conversely, in the "logical approach", we are working in a doctrine rather than a single category. A more apt description might be "petit"/"gros".
Realizability models kind of break this distinction, as you tend to do a lot of work inside a realizability model, but you also are interested in relating it to other models.
Reed Mullanix said:
To be honest, I don't really like the term "denotational". Typically, it's defined as "constructing a bunch of mathematical objects that describe the meaning of programs", but this includes any notion of semantics lol.
Oh good, I'm glad I'm not the only one who feels that way.
I tend to consider semantics "denotational" when there's a more-or-less-compositional mapping that is doing most of the work.
Doesn't a "compositional mapping" just mean that it's a functor, which is true of any notion of semantics?
One more plausible definition is that you are looking at models in concrete categories; perhaps “concrete semantics” is a better descriptor?
Mike Shulman said:
Doesn't a "compositional mapping" just mean that it's a functor, which is true of any notion of semantics?
If I understand my PL right, I think that a compositional map is defined inductively over the syntax of the programming language.
My platonic ideal of denotational semantics is that it's basically defined by structural recursion on the syntax, yeah.
The more you need some fancy recursion scheme that can't be explained as "structural" the less denotational it is.
Would it be fair to say that the 'structured categories' in the Moggi paper are categories with monads?
I guess 'computational λ-calculus' is the doctrine?
I'm failing to see how you could define a notion of semantics other than by recursion on the syntax. Can you give an example?
Reed Mullanix said:
One more plausible definition is that you are looking at models in concrete categories; perhaps “concrete semantics” is a better descriptor?
Do you mean "concrete" in the precise sense of "admitting a faithful functor to Set"?
A big-step or small-step operational semantics will often recurse on the same piece of syntax many times or even no times. This is recursion but it's not structural recursion.
Sorry, I still don't get it. Are you talking about -reduction steps?
Yes. I think that James is talking about β reduction steps.
If I understand correctly, in small-step operational semantics (also called transitional semantics), you define the next step relation/function using recursion on the syntax (and I think that this 'syntax' is not just the syntax of the programming language, but you could have additional things like 'stores'/'memories', etc.). For instance, we could be giving the definition of a function step: (Program, Store) → (Program, Store).
The 'real' semantics is then defined using the next step relation. 'Real' semantics should (imo) be a function which maps Programs to some semantic domain. For instance, in our example above, real_semantics : Program → [Store → (Store + 'diverge')].
On the other hand, when we want to give a compositional (aka denotational aka fixedpoint) semantics, we 'directly' define the 'real_semantics' by recursion on the syntax of programs. Moreover, I think that often, you'd need fancy semantic domains to be able to do this.
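To make the contrast concrete, here is a minimal sketch (not from the thread; Haskell with made-up names, and no stores) of a small-step "next step" function versus a denotational semantics defined by structural recursion:

data Exp = Num Int | Add Exp Exp
  deriving Show

-- Small-step operational semantics: one reduction at a time. The driver
-- 'run' re-examines the whole program after every step, so the overall
-- evaluation is not structural recursion on syntax.
step :: Exp -> Maybe Exp
step (Add (Num m) (Num n)) = Just (Num (m + n))
step (Add e1 e2) =
  case step e1 of
    Just e1' -> Just (Add e1' e2)
    Nothing  -> fmap (Add e1) (step e2)
step (Num _) = Nothing

run :: Exp -> Exp
run e = maybe e run (step e)

-- Denotational (compositional) semantics: the meaning of an expression is
-- defined directly by structural recursion from the meanings of its parts.
denote :: Exp -> Int
denote (Num n)     = n
denote (Add e1 e2) = denote e1 + denote e2

For this toy language the semantic domain is just Int; it's when you add features like general recursion or effects that you need the fancier semantic domains mentioned above.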
Hi I learnt that there are three forms of semantics of programming languages: operational, denotational and axiomatic. I think I learnt it from Andy Pitts' lecture notes, slide 3. The three forms can be related via various theorems but they have different aims and people don't always state or prove them.
In his lecture notes, Andy does talk about equivalence. But I think operational semantics often doesn't focus on equivalence of programs, e.g. this is only very briefly hinted at in the operational semantics in the definition of Standard ML. When I started out, it seemed that a lot of people were mainly proving the theorems in the style of Wright and Felleisen, which was very influential.
Let's consider the doctrine which is the following: λ-calculus, with a base type o and function types only. In this doctrine, I think that theories are just sets of equations, where an equation is a 4 tuple (H, M, N, τ), where H is a context, τ is a type, and M, N are (well-typed) terms.
In this 'doctrine', there is a notion of soundness and completeness with respect to 'type frames' (for example, as defined here: https://fsl.cs.illinois.edu/teaching/2021/cs522/CS522-Fall-2021-Simply-Typed-Lambda-Calculus.pdf page 33).
I'm wondering whether we can consider 'type frames' as structured categories (as one does in the adjunction/equivalence between the category of structured categories and the category of theories)?
Normally I would consider function types to be part of the doctrine, but the base type o to be part of a theory.
In general a theory can contain any number of base types, any number of undefined term symbols, as well as axioms.
Would it be safe to say that categorical models are a generalization of these 'type frame' models? I think people do say this, but I'm not really sure what this means
I expect with a bit of work, a 'type frame' can be turned into a model in Set, since it associates each type to a set.
Here's a question which is particularly vague (mostly because I don't understand things well enough), but here goes.
There seem to be 'type-theory' style proof assistants. These include coq, agda, nuprl, etc. These have dependent types. The Curry-Howard correspondence also applies here. Proofs are programs / morphisms into some proposition object. Propositions are types.
There are also 'logic' style proof assistants. For instance, Isabelle/HOL. Here, propositions are terms, and what you need to prove is p = true, for a proposition p. Here, proofs are not first class objects that you can manipulate, like you can do in dependent type theories.
Here's my vague question: Does the framework of categorical logic only apply to the first kind of doctrine? Maybe a related question is as follows: when you have the doctrine / type theory for first order logic, is a proof a morphism to a proof object (the first style), or is a proof a demonstration that two morphisms are equal (the second style)?
Well, for one thing, I don't think the dichotomy is as sharp as that. Most dependent type theory based proof assistants have Russell-style universes, in which case propositions, like all types, are also terms, namely terms belonging to some universe. And some such proof assistants now have universes of proof-irrelevant propositions ("SProp") such that any two proofs of such a proposition are definitionally identical, so while they are still technically first-class objects you never need to know or think about them beyond whether or not there is one.
I would say the main difference is in whether the judgmental/dependency structure is "stratified" or not: whether propositions belong to the same "world" of types so that types can depend on propositions as well as vice versa, or whether propositions are allowed to depend on types but not the other way around. These two choices correspond to two different kinds of categorical structure: in one case the fibration of types/propositions has "comprehension" whereas in the other it doesn't. But both have a categorical semantics.
Mike Shulman said:
I would say the main difference is in whether the judgmental/dependency structure is "stratified" or not: whether propositions belong to the same "world" of types so that types can depend on propositions as well as vice versa, or whether propositions are allowed to depend on types but not the other way around. These two choices correspond to two different kinds of categorical structure: in one case the fibration of types/propositions has "comprehension" whereas in the other it doesn't. But both have a categorical semantics.
If propositions are allowed to depend on types, but not the other way around, I'm guessing that that would correspond to 'logic'-style (perhaps this is bad terminology), where you typically want to prove p=true?
Suraaj K S said:
Here's my vague question: Does the framework of categorical logic only apply to the first kind of doctrine?
Certainly not. Lawvere's work on hyperdoctrines is I think the original "categorical logic" and it models first-order logic.
Suraaj K S said:
If propositions are allowed to depend on types, but not the other way around, I'm guessing that that would correspond to 'logic'-style (perhaps this is bad terminology), where you typically want to prove p=true?
Whether you phrase the proof rules as "proving p" or "proving p=true" is an irrelevant detail of syntax, it doesn't change the categorical semantics.
I am a little curious about the following - theories of doctrines don't need to have a special sort Prop, right? However, I'm guessing that when people say first order logic, they are referring to a theory which has Prop in the first order logic doctrine? Or does the doctrine itself have Prop built in?
A doctrine can stipulate the existence of certain special types like Prop, in addition to whatever base types are specified by different theories in that doctrine. However, when I think of first-order logic, I don't think of Prop as being a type but rather a judgment. When Prop is a type, then that's more like higher-order logic.
I see. That makes sense. I was reading https://www.google.com/books/edition/Theories_of_Programming_Languages/X_ToAwAAQBAJ?hl=en&gbpv=1&pg=PA1&printsec=frontcover.
The book starts off with the syntax and semantics of predicate logic (section 1.1-1.2). Here is a summary of the syntax:
The next section defines denotational semantics.
For instance, the semantic domain for integer expressions is (<Var> → Nat) → Nat. That is, a semantic domain maps a state to a value. It is fairly straightforward to define these semantic functions.
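For concreteness, here is a minimal sketch of such a semantic function (my own rendering, not from the book; Haskell with made-up names, using Int in place of Nat):

type Var   = String
type State = Var -> Int          -- a state assigns a value to each variable

data IntExp = Lit Int | V Var | Plus IntExp IntExp

-- The semantic function is defined by structural recursion on syntax.
sem :: IntExp -> State -> Int
sem (Lit n)      _ = n
sem (V x)        s = s x
sem (Plus e1 e2) s = sem e1 s + sem e2 s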
If we want to view this using the lens of categorical semantics, I would say that the text described a theory in a doctrine, and not a doctrine itself. We would want to be able to have other theories in the doctrine, besides the theory of natural numbers, I guess.
What would be the 'right' doctrine in this case?
Another thing that I notice is that the text doesn't really give axioms, but just gives an intended model. On the other hand, the description of doctrines must (?) have axioms too?
One small note is that variables are not nullary operations, but just a standard term-constructor. All theories contain variables, so they aren't specified as part of the theory the way constants and operations like + are.
Overall, I haven't looked at the book, but from your description it looks like the doctrine is just first-order logic. The theory in this doctrine then has one sort (called "integers"). As I said, in first-order logic, assertions/propositions are another judgment, not a sort.
As for axioms, a theory in a doctrine is allowed to contain axioms, but it could decline to exercise that right (or, equivalently, have an empty set of axioms).
So it looks to me as though this is a description of a particular theory in the doctrine of first-order logic (with the doctrine and the theory a bit mixed together in the presentation), with one sort called "integers" and some operation symbols (nullary constants, binary plus, etc.).
Mike Shulman said:
One small note is that variables are not nullary operations, but just a standard term-constructor. All theories contain variables, so they aren't specified as part of the theory the way constants and operations like + are.
Oh yes. That makes sense. The book has <IntExp> := 0|1.... | x | y ... , etc. which I blindly translated..
The text, however, doesn't have 'assertion' variables the way it does 'integer expression' variables. Maybe that doesn't make it as clean as it should be
No, I wouldn't expect it to.
That goes along with propositions being a judgment and not a sort: you can't (in ordinary first-order logic) hypothesize an arbitrary proposition.
There seem to be three notions of what 'an algebra' is: an algebra for a monad, an algebra for an endofunctor, and an algebra (model) of an algebraic theory in the categorical-logic sense.
I was wondering if the above notions of 'algebra' are related? Maybe the endofunctor version is just a generalization of the monad version? I'm not sure how this links to the categorical logic version, though..
I don't know what "just a generalization" means, but algebras of monads seem quite different from algebras of endofunctors. Note that if T is a monad, an algebra of the monad T is not the same as an algebra of the underlying endofunctor of T.
I don't know a way to systematically convert a monad T into an endofunctor F such that algebras of the endofunctor F are equivalent to algebras of the monad T. Nor do I know a way to systematically convert an endofunctor F into a monad T such that algebras of the monad T are equivalent to algebras of the endofunctor F. (If such processes exist, even under special conditions, I'd love to know about it!)
On the other hand, there's a famous result due to Lawvere that if T is a finitary monad on the category Set, then there's a Lawvere theory L such that algebras of T are equivalent to algebras of L. And this is reversible: for any Lawvere theory L we can find a finitary monad T on Set with the same algebras.
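As a small concrete illustration of the difference (not from the thread; Haskell with made-up names): an algebra for an endofunctor is just a structure map, while an algebra for a monad is the same data subject to laws that Haskell can't enforce but that we can spot-check:

import Control.Monad (join)

-- An algebra for an endofunctor f: any map f a -> a, with no laws required.
type FAlgebra f a = f a -> a

-- 'length' is an algebra for the list endofunctor, but it violates the
-- unit law (alg . return = id) required of an algebra for the list monad.
lengthAlg :: FAlgebra [] Int
lengthAlg = length

-- 'sum' is additionally an algebra for the list *monad*: it satisfies
--   sum . return = id    and    sum . join = sum . map sum
sumAlg :: FAlgebra [] Int
sumAlg = sum

main :: IO ()
main = do
  print (sumAlg [3] == 3)                                              -- True: unit law holds
  print (lengthAlg [3] == 3)                                           -- False: unit law fails
  print (sumAlg (join [[1,2],[3]]) == sumAlg (map sumAlg [[1,2],[3]])) -- True: multiplication law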
John Baez said:
Nor do I know a way to systematically convert an endofunctor F into a monad T such that algebras of the monad T are equivalent to algebras of the endofunctor F.
This is called the [[algebraically-free monad]] generated by the endofunctor, and exists under mild conditions such as for an accessible endofunctor on a locally presentable category.
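For what it's worth, here is the standard construction in Haskell (a sketch, assuming f is a Functor; names are illustrative): the algebraically-free monad on f is the usual free monad, and fold witnesses that its algebras on a carrier correspond to plain f-algebras on that carrier.

data Free f a = Pure a | Op (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Op fx)  = Op (fmap (fmap g) fx)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Op fg  <*> x = Op (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a >>= k = k a
  Op fx  >>= k = Op (fmap (>>= k) fx)

-- Every f-algebra on x extends to an evaluator for Free f, which is the
-- sense in which Free f is the *algebraically* free monad on f.
fold :: Functor f => (f x -> x) -> Free f x -> x
fold _   (Pure x) = x
fold alg (Op fx)  = alg (fmap (fold alg) fx)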
Are 'Lawvere theories' a synonym for "theories in the doctrine of equational logic"?
https://ncatlab.org/nlab/show/Lawvere+theory seems to tell me that. I just want to make sure that I'm not misunderstanding
Perhaps because I don't share your background, I don't know what you think "equational logic" includes. Lawvere theories only describe n-ary operations obeying universal equational laws.
A Lawvere theory is sort of the "extensional essence" of a single-sorted theory in the doctrine of equational logic. It doesn't privilege any particular generating operations or axioms, but includes all the derivable operations and equations between them on the same footing.
John Baez said:
Perhaps because I don't share your background, I don't know what you think "equational logic" includes. Lawvere theories only describe n-ary operations obeying universal equational laws.
What I'd meant was what people call 'algebraic theories', etc. For example, as seen here: https://awodey.github.io/catlog/notes/catlog1.pdf
Mike Shulman said:
A Lawvere theory is sort of the "extensional essence" of a single-sorted theory in the doctrine of equational logic. It doesn't privilege any particular generating operations or axioms, but includes all the derivable operations and equations between them on the same footing.
I see, that makes sense. So, from every single-sorted theory of equational logic, I'm guessing that you 'can create' a Lawvere theory. It does seem to be related to the 'syntactic' category that the theory gets mapped to under the 'term-model' functor of the adjunction. If I understand correctly, the syntactic category is also freely built this way..
This answer: https://math.stackexchange.com/questions/1520546/difference-between-lawvere-theory-and-its-syntactic-category seems to say that they (the Lawvere theory and the syntactic category) are the same
Yes, that's right: the Lawvere theory associated to a syntactic theory is the syntactic category, at least when it's regarded as a theory in the doctrine of finite-product categories. (You can also regard an equational theory as living in the doctrine of cartesian multicategories, in which case its syntactic (multi)category contains the same information but isn't exactly what is usually referred to as a "Lawvere theory".)
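(To spell out the standard picture for concreteness: for a single-sorted equational theory with generic sort x, the associated Lawvere theory has objects the finite powers x^n, and its morphisms x^n → x are the derived n-ary operations of the theory modulo provable equality, which is exactly how the syntactic category is built.)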
More on your original question about the three notions of theory:
So you can see that the three notions are closely related, though not identical.
I'm curious if something like CiC (Calculus of inductive constructions) would fit in the doctrinal framework.
So far, we have been thinking about doctrines like equational logic, STLC, etc. I am guessing that the Calculus of Constructions can also be thought of as a doctrine. This is the 'topmost' point of the lambda cube. Furthermore, I am guessing that we could have 'theories' in this doctrine, such as the theory of natural numbers, binary trees, etc.
The Calculus of Inductive Constructions (baby coq), in some sense has infinitely many datatypes (in that you can define new datatypes). Can we still consider this as a 'theory' of CoC? Or would CiC be a doctrine in its own right? In that case, what would its theories be?
Note: I picked CiC arbitrarily, and I could have probably asked the same question about Agda, etc...
STLC already has infinitely many types, generated by the rule that if A and B are types then so is A → B. I don't see the inductive datatype-forming operation in CIC as fundamentally different, although of course it's more complicated.
Rules that generate new types are never part of a theory, though; they have to be specified in the doctrine. So CiC is a doctrine that extends CoC.
Theories in CiC would be, like theories in any other doctrine, specified by giving some generating types, terms, and axioms.
BTW, saying "CiC" has an advantage for this discussion over saying "Coq" or "Agda" in that it is closer to precisely mathematically specified than any real-world proof assistant is.
Mike Shulman said:
Theories in CiC would be, like theories in any other doctrine, specified by giving some generating types, terms, and axioms.
That makes sense. Would it be correct to say that the doctrine of CiC has all inductive datatypes built-in? So things like the Natural numbers, etc. exist for all theories in the doctrine of CiC.
This 'base' theory in the doctrine with no sorts and axioms is probably what the core of Gallina is. Is it possible to 'do' other theories with more sorts in Gallina? The Axiom keyword lets us add axioms. Is there a way to add sorts?
Yes, that's right. If by "sort" you mean the same thing as a "type", then you can add generating types in Coq with Axiom A : Type.
One way you'd do this is to add:
- a type of inductive datatype descriptions,
- a type former inductive d that lifts descriptions into types,
- introduction and elimination rules for inductive d, along with associated beta laws (you could add eta laws, but this immediately makes typechecking the resulting theory undecidable).
Basically every system follows the same pattern for most of these steps; the variations typically have to do with how inductive descriptions are turned into types. For instance, CiC does not have an inductive d type former; instead, they modify all their judgements to be parameterized by an extra context of datatype descriptions. This is typically done to ensure that equality of inductive types is not structural. For instance, the following two types are not equal in CiC.
Inductive nat : Set :=
| zero : nat
| suc : nat -> nat.

Inductive num : Set :=
| zilch : num
| plus_one : num -> num.
That being said, both the extra inductive context and type-former approaches can be expressed as generalized algebraic theories.
I'm curious to know whether the doctrinal framework can be used with the untyped lambda calculus as a doctrine too? I was wondering, for example, if Dana Scott's notion of a model of the untyped lambda calculus can be thought of as some structure preserving functor from a syntactic category to another?
Generally speaking, from a doctrinal/categorical/type-theoretic perspective, anything "untyped" is really "unityped", which is a property of a theory rather than a doctrine.
Yes, it's kinda like the self-proclaimed "untyped" theories think they're not using types, but the view from outside is that they're using just one type - and that's the right view if you're trying to relate those "untyped" theories to the rest of the world.
And with that said, then yes, models of the untyped lambda calculus are special cases of models of theories in the typed lambda calculus.
Mike Shulman said:
And with that said, then yes, models of the untyped lambda calculus are special cases of models of theories in the typed lambda calculus.
That makes sense.
I'm wondering if we can consider the untyped lambda calculus be a theory of a weaker doctrine like equational logic as well?
Naively, I think that nothing stops us from doing so, as we could include a whole bunch of axioms that makes the theory 'just work'.
I don't think that's possible with a naive approach, since equational logic doesn't have a notion of "variable binding" that you would need to describe lambda-abstractions. But you could perhaps encode combinatory logic instead.
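(For what it's worth, the standard way that encoding goes: combinatory logic is a single-sorted equational theory with a binary operation · for application and constants S and K, subject to the axioms (K·x)·y = x and ((S·x)·y)·z = (x·z)·(y·z), so it does fit in the doctrine of equational logic even though lambda-abstraction itself does not.)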
Sorry to come late again. Just to add, enriched equational logic / enriched Lawvere theories / enriched monads can allow variable binding. For the simplest case, enrich in . I call the syntax `parameterized algebraic theories', and we included the theory of untyped lambda calculus very briefly at the end of our paper about jumps.