Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: learning: questions

Topic: Enumerating properties of categories


view this post on Zulip Julius Hamilton (Apr 05 2024 at 23:59):

This is a sub-question I would like to sort out before returning to my thoughts on terminal objects.

I would like to enumerate, in a logical language, all possible properties which a category could have. By property, I mean anything like, “being a thin category”, “being a dagger category”, “having a terminal object”, etc.

I am going to assume all properties about a category will be expressed via quantification over the “elements” of the category. I will only need the universal and existential quantifier. By “elements”, I mean “things” that the category is built out of. For example, a standard category has a set of objects, and a set of arrows. But some category-theoretic structures could have, say, a set of 0-morphisms, a set of 1-morphisms, and a set of 2-morphisms.

To keep this simple, I’ll just explore statements involving a universal quantifier, at first.

The simplest “property” I could think of so far, is:

$\forall X \in C\ (X = X)$

To keep my enumeration simple, I’ll also focus on enumerating properties on the objects, and not the arrows, for now.

Here are some other syntactically simple statements I can think of:

$\forall X \in C\ (X \in C \implies X \in C)$

$\forall X \in C\ (\forall Y \in C\ (X = Y \implies X = Y))$

$\forall X \in C\ (\forall Y \in C\ (X = Y \implies Y = X))$

$\forall X \in C\ (\neg (X = X) \implies X = X)$

To be clear, I am trying to list out every well-formed logical statement about a category, whether it is true or not. I am struggling to formulate how to enumerate over all possible such statements, but I am going to practice stating some less comprehensive ones first.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:09):

If we assume $X$ is a variable ranging over “the objects in a category $C$”, then let’s consider all statements that begin with $\forall X$. I.e., this is a “single-variable” theory / “fragment of first-order logic”.

I believe that as long as we have some “atomic proposition”, it is already well known what the entire “theory” generated from it is, using “standard” logical connectives - a Boolean algebra (and, there are multiple “functionally complete” choices of operators that generate an equivalent structure, such as NAND.)

What is interesting is that on both objects and arrows, there is, as far as I can tell, only one “truth-bearing” predicate on them, which is equality. So, if I take “$X = X$” as my “atomic proposition”, then there is a Boolean algebra generated from it.

For this reason, it shouldn’t be that hard to enumerate statements about objects. We only have one possible predicate. The only other thing we can “vary” is the number of variable terms - for example, $\forall X (\forall Y (\forall Z (X = Y = Z)))$.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:09):

There is more potential interest when enumerating statements about arrows, because there, we have 2 predicates, not one: composition and equality. (Actually, composition is a function, not a predicate.)

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:14):

I’m thinking maybe I can pinpoint underlying “enumerative axes”, then put them all together. You can enumerate over:

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:22):

I think I see it so much clearer now.

The zero-order terms are a collection of objects, a collection of arrows, and a collection of truth-values.

There is a first-order function called composition, which takes 2 arrows as arguments. Its terms look like $Comp(f, g)$.

There is a first-order function, $=_{Ob}$, which takes 2 objects and 1 truth value. Its terms look like $=_{Ob}(A, B, True)$.

There is a first-order function, $=_{Ar}$, which takes 2 arrows and 1 truth value (analogously to $=_{Ob}$).

There is a first-order function Dom, which takes an arrow, an object, and a truth value. Its terms look like $Dom(A, a, False)$.

There is a first-order function Cod (analogous to $Dom$).

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:25):

I think it should be possible to state a “closed formula” to enumerate all possible terms, in the above system.

view this post on Zulip David Egolf (Apr 06 2024 at 00:25):

You may find this (broadly related) nLab article interesting to browse: stuff, structure, property. I really like how it talks about "stuff", "structure" and "properties" in terms of functors "forgetting" different things.

On a related note, you may also enjoy taking a look at section 2.4 of Lectures on n-categories and cohomology. That section starts out like this:

Stuff, structure, and properties. What’s all this nonsense about? In math
we’re often interested in equipping things with extra structure, stuff, or properties,
and people are often a little vague about what these mean. For example, a group
is a set (stuff ) with operations (structure) such that a bunch of equations hold
(properties).

You can make these concepts very precise by thinking about forgetful functors.

I'm not sure how related these resources are to what you've been discussing in this topic. But these resources talk about "properties", which is what brought them to mind for me!

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:38):

I will enjoy reading them greatly. Thank you!

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:44):

I’m going to respond to the information you shared, because the ideas of a “stuff type” and a “structure type” in particular have to do with some of my thoughts.

That said, I want to try to finish my train of thought about how to enumerate the expressions above.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:54):

It appears that a first order function acts like a “formation rule” on zero order terms.

I said I had 5 first-order functions: Dom, Cod, $=_{Ob}$, $=_{Arr}$, and $\circ$.

$\circ$ is binary; the other 4 are ternary.

$\circ$ is restricted to the sort arrows.

Then, it is easy to enumerate the terms it will generate. Let’s say we have $n$ arrow symbols. Then in the first argument, we will have $n$ choices, and for each of them, $n$ choices for the second argument. Then there are $n^2$ $\circ$-terms.

$=_{Ob}$ and $=_{Arr}$ are analogous. (I suppose I realize I did not specify any properties for equality, like symmetry and transitivity, so I’ll have to work that out later). If we keep with the simple formula of multiplying the size of the set that each argument takes values in, then $=_{Ob}$ has $2o^2$ terms, where $o$ is the number of object symbols, and similarly, $=_{Arr}$ has $2a^2$ terms.

Dom and Cod are simple - both are $2oa$.
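
A quick way to sanity-check those counts is to brute-force them over tiny, made-up symbol sets. A minimal sketch (the symbol names below are illustrative, not anything fixed above):

```python
from itertools import product

# Tiny, made-up symbol sets (illustrative only).
objects = ["A", "B", "C"]          # o = 3 object symbols
arrows = ["f", "g"]                # a = 2 arrow symbols
truth_values = ["True", "False"]   # 2 truth-value symbols

# "First-generation" terms for each first-order function, by brute force.
comp_terms  = [f"Comp({x},{y})"     for x, y    in product(arrows, arrows)]
eq_ob_terms = [f"=Ob({x},{y},{t})"  for x, y, t in product(objects, objects, truth_values)]
eq_ar_terms = [f"=Arr({x},{y},{t})" for x, y, t in product(arrows, arrows, truth_values)]
dom_terms   = [f"Dom({x},{y},{t})"  for x, y, t in product(objects, arrows, truth_values)]

o, a = len(objects), len(arrows)
assert len(comp_terms)  == a ** 2       # n^2 composition terms
assert len(eq_ob_terms) == 2 * o ** 2   # 2o^2 object-equality terms
assert len(eq_ar_terms) == 2 * a ** 2   # 2a^2 arrow-equality terms
assert len(dom_terms)   == 2 * o * a    # 2oa Dom terms (Cod is analogous)
```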

view this post on Zulip John Baez (Apr 06 2024 at 00:54):

Julius Hamilton said:

Here are some other syntactically simple statements I can think of:

$\forall X \in C\ (X \in C \implies X \in C)$

$\forall X \in C\ (\forall Y \in C\ (X = Y \implies X = Y))$

$\forall X \in C\ (\forall Y \in C\ (X = Y \implies Y = X))$

$\forall X \in C\ (\neg (X = X) \implies X = X)$

To be clear, I am trying to list out every well-formed logical statement about a category, whether it is true or not.

Above you're only listing some true ones, and omitting some simpler false ones.

It's much easier to list all well-formed statements, regardless of whether they're true or not. Logicians call these 'sentences'. And it's even easier to list the 'formulas'. An example of a formula that's not a sentence is

$X = Y \implies Y = Z$

and an example of a sentence is

$\forall X \forall Y \forall Z\ (X = Y \implies Y = Z)$

(Here I'm not bothering to write $X \in C$, etc.)

Wikipedia gives the rules for generating all formulas here. First you choose your non-logical symbols, then you inductively define 'terms', and then, using those, you inductively define 'formulas'.

None of this is really about category theory: you are really studying first-order logic. The theory of categories, at least the way you're starting to describe it, is an example of a theory in first-order logic.
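
For concreteness, here is a rough sketch of that inductive generation for a small hypothetical signature (the constant, function, and predicate names are illustrative and not meant to be the exact first-order theory of categories):

```python
from itertools import product

# A tiny, hypothetical first-order signature (names are illustrative).
constants = ["c"]
functions = {"dom": 1, "cod": 1, "comp": 2}   # function symbol -> arity
predicates = {"=": 2}                          # predicate symbol -> arity
variables = ["x", "y"]

def terms(depth):
    """All terms built from variables, constants and function symbols, up to a nesting depth."""
    current = set(variables) | set(constants)
    for _ in range(depth):
        new = set(current)
        for f, arity in functions.items():
            for args in product(current, repeat=arity):
                new.add(f"{f}({','.join(args)})")
        current = new
    return current

def atomic_formulas(depth):
    """Atomic formulas: a predicate symbol applied to the right number of terms."""
    ts = sorted(terms(depth))
    return [f"{p}({','.join(args)})"
            for p, arity in predicates.items()
            for args in product(ts, repeat=arity)]

# Compound formulas would then be built inductively from the atomic ones
# with connectives and quantifiers, e.g. "forall x (phi -> psi)".
print(atomic_formulas(1)[:5])
```

The two inductive stages (terms first, then formulas) mirror the order of definitions on the Wikipedia page.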

view this post on Zulip John Baez (Apr 06 2024 at 00:57):

By the way, where I said the word 'true' above, I would normally say 'provable'. Only God knows what's 'true' (though I heard even she has some doubts). But we can show certain sentences are provable using certain axioms and certain deduction rules. Indeed we can write computer programs to do this.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 00:58):

I’m familiar with that Wikipedia page, but I don’t find the presentation clear enough.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 01:01):

For example, that presentation of FOL distinguishes between terms, which are well-defined “entities”, and formulas, which are expressions that map to “true” or “false”. My presentation above does not make this distinction. It just includes the set of truth-values as other terms.

view this post on Zulip John Baez (Apr 06 2024 at 01:07):

Hmmm. So you're blending the concept of "theory" and "model" in a nonstandard way. Is there anything unclear about the Wikipedia page - i.e., hard to understand - or is it just that you don't like it? They're trying to present first-order logic in the usual way. If there's anything unclear about it, I might fix it.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 01:12):

I appreciate you saying that. I can try to answer your question in detail, but I need to keep going in my own investigation to figure out what it is I am trying to work out. I’m about to post my thoughts on a possible enumeration algorithm. (After that, I have to revise the algorithm to only generate “self-consistent” theories, hopefully - ie, not generating “nonsense” categories where the arrows don’t compose).

I know of only one philosophical angle that might be relevant - there is apparently a minority view in logic that thinks “there is no real semantics; it’s all syntax”. I am really interested in this perspective. I think it’s touched on here (but it’s not that important for me to go on in my approach.)

view this post on Zulip John Baez (Apr 06 2024 at 01:14):

If you want to avoid semantics and stick to syntax, the usual way is to inductively define "provable" sentences given a set of axioms. Then purely syntactic rules let you enumerate all the provable sentences.

view this post on Zulip John Baez (Apr 06 2024 at 01:14):

The map from sentences to "true" and "false" is part of what we call semantics.

view this post on Zulip John Baez (Apr 06 2024 at 01:15):

But anyway, go ahead and try stuff! I do a lot of that too.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 01:42):

^That’s probably what I’m going for. Anyway, here is the next chunk of thought:

Basically, my intuition tells me that it will be impossible to write the “formation rule” to generate all terms, without recursion, but I need to dwell on why that is.

I have a hope that each “n-order” function in the theory can be “instructed” to apply itself to every term in a collection (of the relevant sort). After it does that, it is instructed to do the same thing again (since it just generated a bunch of new terms).

This may lead to the issue of, “but in what order should you choose to apply the n-ary functions”? My hope is, you could enumerate “one level up” over them, as well.

Something like this (maybe):

3 zero-order “functions”:
objects = $\{a, b, c, \ldots\}$
arrows = $\{x, y, z, \ldots\}$
truthvalues = $\{0, 1\}$

5 first-order “functions”:
$=_{Ob}(x : Objects, y : Objects, z : truthvalues)$
$=_{Arr}(x : Arrows, y : Arrows, z : truthvalues)$
$Dom(x : Objects, y : Arrows, z : truthvalues)$
$Cod(x : Objects, y : Arrows, z : truthvalues)$
$\circ(x : Arrows, y : Arrows)$

(Possibly 2nd-order functions are logical connectives like “and”. The problem here is I haven’t defined my functions to have a “return type”, so maybe I have to start over in that regard.)

(Note that these are “syntactic” functions - they just form new syntactic units, by plugging in a specific value into the variables.)

(Maybe later we can add in an ability to define a “category” with a “membership” function, equality of categories, etc.)

As sketched in a previous message, it is easy to “enumerate” all terms of one of the above functions, with universal quantification over its arguments (which corresponds to “for-loops”):

$\forall x, y, z\, [=_{Ob}(x, y, z)]$ suffices.

There is some reason I can’t see clearly now why this isn’t good enough.

The main idea I was getting at was, there is a way to enumerate a function over all the terms in its domain. However, these functions generate new terms. For example, $=_{Ob}$ can form a new expression, which is essentially of “truth_value” type. When that happens, a “logical connective” function, like “and”, needs to include that term in its enumeration, in order to count as “comprehensive”.

So, what if we just went in a loop, each function taking turns to enumerate all the terms it can? It reminds me a bit of the von Neumann cumulative hierarchy.

Thus, we have a… “list” or “collection” of functions: [objects, arrows, truth_values, $=_{Ob}$, $=_{Arr}$, Dom, Cod, $\circ$].

Hopefully, there is a general “instruction set” which understands that “when you come across a variable, enter a sub-routine where you loop over every term in that variable’s type”.

So, it would do nothing for the first three nullary functions. When it came to $=_{Ob}$, it would enter a sub-routine of iterating over all terms in $objects$. For each, it would proceed to the second argument of $=_{Ob}$, and enter a second iterative sub-routine. So doing, it would generate all possible “first-generation” terms for $=_{Ob}$. It would then continue its top-level iteration, passing to the $=_{Arr}$ function.

When it has done that, it has appended many new terms into the collection $truthvalues$ (actually no, I now realize it should append those terms to a collection called “propositions”).

The hope would be, this would be an effective algorithm in enumerating all terms. And maybe, with some more work, it could enumerate all properties of categories (the original hope) - in that list of expressions, we would find the logical definition of “has all product objects”, etc. Will think more on this.
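
A minimal sketch of that loop, keeping the nonstandard setup above where truth values are just another sort and everything is string formation (the sorts and symbol names are the ones sketched above; the code itself is only an illustration):

```python
from itertools import product

# Sorts, seeded with the zero-order symbols (illustrative names).
sorts = {
    "objects": ["a", "b"],
    "arrows": ["x", "y"],
    "truthvalues": ["0", "1"],
    "propositions": [],   # terms formed by =Ob, =Arr, Dom, Cod land here
}

# Each first-order "function": name -> (argument sorts, result sort).
signatures = {
    "=Ob":  (["objects", "objects", "truthvalues"], "propositions"),
    "=Arr": (["arrows", "arrows", "truthvalues"], "propositions"),
    "Dom":  (["objects", "arrows", "truthvalues"], "propositions"),
    "Cod":  (["objects", "arrows", "truthvalues"], "propositions"),
    "comp": (["arrows", "arrows"], "arrows"),
}

def one_generation(sorts):
    """Apply every function symbol to every tuple of existing terms of the right sorts."""
    new = {s: list(terms) for s, terms in sorts.items()}
    for name, (arg_sorts, result_sort) in signatures.items():
        for args in product(*(sorts[s] for s in arg_sorts)):
            term = f"{name}({','.join(args)})"
            if term not in new[result_sort]:
                new[result_sort].append(term)
    return new

# Each round feeds the previous round's output back in, generation by generation.
generation_1 = one_generation(sorts)
generation_2 = one_generation(generation_1)
print(len(generation_1["propositions"]), len(generation_2["propositions"]))
```

A layer of “logical connective” functions would then consume the propositions collection in the same way.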

view this post on Zulip Julius Hamilton (Apr 06 2024 at 02:08):

John Baez said:

Is there anything unclear about the Wikipedia page - i.e., hard to understand - or is it just that you don't like it? They're trying to present first-order logic in the usual way. If there's anything unclear about it, I might fix it.

I am really big on stuff like editing Wikipedia and Stack Exchange, so if you want to discuss this article in detail (ie, how I think it could be improved), that would probably be very, very intellectually stimulating for me.

https://en.m.wikipedia.org/wiki/First-order_logic

view this post on Zulip John Baez (Apr 06 2024 at 04:42):

You said it was "unclear" so I'm curious what's unclear about it.

view this post on Zulip David Egolf (Apr 06 2024 at 05:08):

For what it's worth, I've tried and failed various times to productively engage with Wikipedia articles on logic. (Although I can't point to any particular thing about those articles that I didn't like). Actually, it took me a while to find any resources on first-order logic that worked well for me. These are the two books that I've personally found quite helpful:

view this post on Zulip Julius Hamilton (Apr 06 2024 at 05:12):

(Response to John Baez before David Egolf’s message.)

Well, it’s easier for me to start with the first paragraph, since I feel like I can’t pinpoint just one part of the article.

First-order logic—also known as predicate logic, quantificational logic, and first-order predicate calculus—is a collection of formal systems used in mathematics, philosophy, linguistics, and computer science.

The synonymous terms may be loosely valid, but it isn’t that informationally useful. It would be much better to start off with a clear conceptual definition of what first-order logic really is. A “collection of formal systems” does not leave a strong impression. Listing some fields where it’s used is also not necessarily inaccurate, but kind of a distraction for the first introductory sentence, maybe.

First-order logic uses quantified variables over non-logical objects,

If this is one’s initial introduction to first order logic, this probably is almost impossible to understand. But if you already have some exposure to logic, it’s not very informative or insightful. It’s not that clear what the intended value of the sentence is. Are they trying to highlight that a key characteristic distinguishing first-order logic as opposed to zeroth-order is that it features quantification? If so, the sentence could be completely rephrased to highlight that - and then provide a clear, succinct explanation of what quantification is.

and allows the use of sentences that contain variables,

Didn’t they just say it “quantifies over variables”, and now they are adding in that it “allows the use of sentences that contain variables”? This is sort of diluted word salad to me. There is no compelling informational content. It is not teaching or explaining to someone not just what technical features FOL has, but far more critically, its significance in the history of ideas, why it was developed, how it came to have the form it does, why it’s useful, what problems it solves, and so on.

so that rather than propositions such as "Socrates is a man", one can have expressions in the form "there exists x such that x is Socrates and x is a man",

This is an arguably terrible example to introduce someone to logic. It makes logic seem asinine and trivial, like that you can reformulate useless statements about the world like “I have an umbrella” by replacing the word umbrella with a variable. Of course, a simple example in an introductory context helps someone get familiar with how variables act as referents, but it doesn’t really fit in here in the introduction. It could come later in a more “tutorial” part of the article. I’m not trying to be excessively negative, just trying to claim the article has so much greater potential than its current form.

where "there exists" is a quantifier, while x is a variable.[1]

This sentence is ok with me.

This distinguishes it from propositional logic, which does not use quantifiers or relations;[2]

This part is actually kind of useful and interesting.

in this sense, propositional logic is the foundation of first-order logic.

Sort of. It’s a bit too vague of a claim to be true or false, in my opinion. In calling it “the foundation”, it sort of implies that you need propositional logic to “build” FOL. But that’s not usually how it’s presented or thought about.

I can offer a sketch of what I think a better introduction to FOL would be like, or I can continue my commentary on the next paragraph. The key thing to me is that it should not just list mathematical axioms. It needs to give conceptual motivation for why we would need it, and exactly why everything in it is the way it is (why do we have “terms”, “formulas”, etc.)

Also, the Wikipedia talk page is full of commentary of others suggesting the article is kind of bloated and mish-mash. I’d love to try to revise it, but I got banned from Wikipedia for arguing with an admin! Oh well.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 05:13):

David Egolf said:

For what it's worth, I've tried and failed various times to productively engage with Wikipedia articles on logic. (Although I can't point to any particular thing about those articles that I didn't like). Actually, it took me a while to find any resources on first-order logic that worked well for me. These are the two books that I've personally found quite helpful:

I will check them out, thank you. PDFs are always appreciated but I can google for them. Thanks :pray:

view this post on Zulip John Baez (Apr 06 2024 at 07:27):

@Julius Hamilton - I agree with many of your criticisms of the Wikipedia article "First-order logic", though I'm not sure I'll have the energy to rewrite the introduction.

I hadn't bothered looking at the introduction, since my big concern was whether the article gives a precise specification of some fairly common version of first-order logic... which it attempts to do later on.

view this post on Zulip John Baez (Apr 06 2024 at 07:28):

But of course for many readers the introduction is the most important part.

view this post on Zulip John Baez (Apr 06 2024 at 07:39):

I agree with @David Egolf that anyone who really wants to learn first-order logic should read a book or take a course. I was really into logic as a kid so in high school I took a college course based on Kleene's Mathematical Logic, which is good. But the book by Enderton is also good and I bet the other book David recommended is also good. For logic one thing you need is a systematic careful treatment, and a well-written book can do that far better than any article thrown together by a bunch of people.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 16:24):

Ebbinghaus seems to be right where I am. I’m at a point where I would appreciate its content a lot more now. Thank you David.

view this post on Zulip Julius Hamilton (Apr 06 2024 at 16:51):

(Graduate Texts in Mathematics) Heinz-Dieter Ebbinghaus, Jörg Flum, Wolfgang Thomas - Mathematical Logic-Springer Nature Switzerland (2021).pdf

view this post on Zulip Julius Hamilton (Apr 06 2024 at 16:52):

Stewart Shapiro_ William J. Wainwright - The Oxford Handbook of Philosophy of Mathematics and Logic-OUP USA (2005).pdf

view this post on Zulip Julius Hamilton (Apr 06 2024 at 16:58):

Now that I have more free time, being unemployed, I do try to state a “goal” in my mathematical learning for my day. Today, I will try to study Ebbinghaus, to get enough contextualization for my thoughts about logic, to return to my question of “enumerating properties of categories”, which in turn was part of my desire to understand the “significance” of terminal objects in a new light.

view this post on Zulip Vincent R.B. Blazy (Apr 06 2024 at 18:53):

Julius Hamilton said:

5 first-order “functions”:
$=_{Ob}(x : Objects, y : Objects, z : truthvalues)$
$=_{Arr}(x : Arrows, y : Arrows, z : truthvalues)$
$Dom(x : Objects, y : Arrows, z : truthvalues)$
$Cod(x : Objects, y : Arrows, z : truthvalues)$
$\circ(x : Arrows, y : Arrows)$

(Possibly 2nd-order functions are logical connectives like “and”. The problem here is I haven’t defined my functions to have a “return type”, so maybe I have to start over in that regard.)

(Note that these are “syntactic” functions - they just form new syntactic units, by plugging in a specific value into the variables.)

I’m really not sure why you distinguish function/predicate symbols in this peculiar way. Usually, whatever (intuitively or semantically) returns or is mappable to truth values is presented (at least in the first-order case) as a formula (exo)former from terms, that is, a predicate or relation symbol such as the equalities here, with no mention of truth values in the syntax.
And term formers from terms, or endoformers of terms as I call them in the big picture of syntax, a.k.a. function symbols, are again presented without any truth value involved… since it’s syntax (even semantics is syntax when formalized, as you mentioned being interested in, but let’s focus on syntax itself).

Or, predicate symbols may be presented as function symbols into an extra sort of truth values (and then the only remaining actual predicate symbol is equality). But then truth values appear as a codomain, not as another domain? :thinking:
Conversely, function symbols may indeed be presented as predicate ones, since conceptually functions are functional relations. But then they are formula formers, not term formers, the only terms are variables, and no formula is closed without a quantifier.

But you seem to mix the two in a way that looks strange to me: why is $\circ$ the only actual function symbol, when Cod and Dom should also construct terms for objects just as $\circ$ does for arrows? The truth values could account for partiality… but Dom and Cod are precisely the total ones, while $\circ$ is the partial one (at least in some actual formalizations of category theory, e.g. as an essentially algebraic theory)?

Sorry, but I’m lost… :sweat_smile::thinking:

view this post on Zulip Julius Hamilton (Apr 06 2024 at 18:56):

It’s ok - this is exactly the kind of response I want. Seems like you know a lot about this. I’ll read through your response soon when I have a second. Thanks :+1:

view this post on Zulip Julius Hamilton (Apr 07 2024 at 18:51):

Basically, this is what I was looking for: https://arxiv.org/pdf/1210.2610.pdf

They provide an explicit algorithm for generating all expressions of lambda calculus (which is Turing-complete) up to string length $n$, and also count the number of terms in each case.
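
As a rough illustration of what such an enumeration can look like (this is not the paper's algorithm, and it uses one of several possible size conventions), here is a small sketch that generates closed lambda terms in de Bruijn notation by size:

```python
from functools import lru_cache

# Size convention used here (one of several possibilities; the paper fixes its own):
#   size(variable) = 1, size(lambda M) = 1 + size(M), size(M N) = 1 + size(M) + size(N).

@lru_cache(maxsize=None)
def terms(size, depth):
    """All lambda terms of exactly `size` whose free de Bruijn indices are below `depth`."""
    out = []
    if size == 1:
        out.extend(str(i) for i in range(depth))                        # a variable
    if size >= 2:
        out.extend(f"(\\ {m})" for m in terms(size - 1, depth + 1))     # an abstraction
    for left in range(1, size - 1):                                     # an application
        for m in terms(left, depth):
            for n in terms(size - 1 - left, depth):
                out.append(f"({m} {n})")
    return tuple(out)

# Closed terms (depth 0 means no free variables) of each size up to 6, with their counts.
for s in range(1, 7):
    print(s, len(terms(s, 0)))
```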

My hope/goal for today will be to make sure I understand how to do that myself, before trying to build on that idea to enumerate “the theory of categories” if possible. I imagine that would consist of writing the axioms for categories in lambda calculus, and then enumerating all derivations from them.

I am pretty sure Andrej Bauer’s Alg does this, which I have been eyeing for a while now.

That said, should probably focus on job-hunting so I don’t run out of money.

view this post on Zulip Vincent R.B. Blazy (Apr 15 2024 at 10:51):

@Julius Hamilton Oh okay, so this is about the algorithmics and combinatorics of the syntax of FOL! OK then, that seems to need nothing different from how this syntax is currently presented… And a different syntax would give different answers anyway :upside_down:

For naked categories it would be the generalized or essentially algebraic theory of a category — or, outright ET2CC — with or without λ-terms as a systematic grammar of higher-order functional terms (a mono-sorted simple type theory) or of first-order symbols of functions (in which case their syntactic structural construction is meta), indeed.

Alg doesn’t enumerate theorems but finite models of some theory… I’m not sure how one could, even potentially, be reducible to the other one :thinking:

All the best for your hunt… I’ll go back to that soon too :smiling_face_with_tear:

view this post on Zulip Julius Hamilton (Apr 15 2024 at 12:00):

Vincent R.B. Blazy said:

Alg doesn’t enumerate theorems but finite models of some theory… I’m not sure how one could even potentially be reducible to the other one

Alg enumerates all the models of a theory? That is to say, since there are countably infinite models of ZFC, it could enumerate the models?

But how does it achieve enumerating even one of the models? A “model” is just something for which the “theory” holds true. If we enumerate all the theorems of the theory, what is the difference between that, and a “model” of the theory?

view this post on Zulip John Baez (Apr 15 2024 at 14:17):

Theorems are about syntax: they are the sentences which can be proved using the axioms of your chosen theory. Models are a completely different thing: they're about semantics. Very very roughly, a model is a map sending all items of syntax in your chosen theory (variables, constants, predicates, function symbols, formulas and sentences) to what they "mean" in the world of set theory. So you need to have set theory up and running before you can talk about models!

view this post on Zulip John Baez (Apr 15 2024 at 14:19):

A sentence is provable if you can prove it from the axioms in your theory: this is a syntactic notion, since it just involves manipulating strings of symbols. A sentence is valid if it holds in all models of your theory: this is a semantic notion, since it's defined using models.

view this post on Zulip Julius Hamilton (Apr 15 2024 at 14:19):

Right - but can’t we only define set theory in terms of FOL to begin with? How do we describe the “model”?

view this post on Zulip Julius Hamilton (Apr 15 2024 at 14:21):

If we start a priori with the primitive notions of first-order logic, I thought we would assume there is a constant c in the language $L$ which maps to the empty set - the empty set is the model of the symbol c. Everything else that we know about the universe of sets has to come from rules of inference on the axioms of ZFC.

view this post on Zulip John Baez (Apr 15 2024 at 14:21):

Julius Hamilton said:

Right - but can’t we only define set theory in terms of FOL to begin with? How do we describe the “model”?

You can define set theory in lots of ways, but a standard one is first-order logic. People do this, and then they use set theory to define all the concepts I was just talking about: first-order logic, sentences, axioms, variables, constants, predicates, function symbols, formulas, models, provability, and validity.

view this post on Zulip John Baez (Apr 15 2024 at 14:22):

That's why this subject is called "metamathematics": we are using mathematics (our starting-point set theory, defined using first-order logic) to study mathematics (for example first-order logic).

view this post on Zulip Julius Hamilton (Apr 15 2024 at 14:23):

So when someone says “syntax and semantics”, they actually just mean “metatheory and theory”. Right?

view this post on Zulip John Baez (Apr 15 2024 at 14:23):

No.

view this post on Zulip John Baez (Apr 15 2024 at 14:24):

I really urge reading a good book or two on logic.

view this post on Zulip John Baez (Apr 15 2024 at 14:28):

I can roughly explain ideas: e.g. I wrote a paragraph explaining the difference between syntax and semantics, and I'm not sure it sunk in. But a paragraph-long sketch is not the same as a careful treatment from a book.

view this post on Zulip Vincent R.B. Blazy (Apr 15 2024 at 14:45):

Julius Hamilton said:

Vincent R.B. Blazy said:

Alg doesn’t enumerate theorems but finite models of some theory… I’m not sure how one could even potentially be reducible to the other one

Alg enumerates all the models of a theory? That is to say, since there are countably infinite models of ZFC, it could enumerate the models?

But how does it achieve enumerating even one of the models? A “model” is just something for which the “theory” holds true. If we enumerate all the theorems of the theory, what is the difference between that, and a “model” of the theory?

The key-word here is finite :wink:

view this post on Zulip John Baez (Apr 15 2024 at 14:54):

Vincent R.B. Blazy said:

The key-word here is finite :wink:

This remark fixes one confusion Julius was having: you said Alg enumerates all the finite models of a theory and he asked how it enumerates all the models of a theory. But it doesn't fix another: he apparently didn't know the difference between the models of a theory and the theorems of a theory:

If we enumerate all the theorems of the theory, what is the difference between that, and a “model” of the theory?

So I think he needs to understand 1) what's a model of a theory, and 2) what's a finite model of a theory before he can understand what it means for a piece of software to enumerate all finite models of a theory.

view this post on Zulip John Baez (Apr 15 2024 at 14:59):

That's why I tried to sketch out the concept of "model". The explanation in the Stanford Encyclopedia of Philosophy article model theory sort of sucks because it precisely explains what a structure for a signature is, but then seems to lose interest before getting to the point: when is a structure for a signature a model for a theory! That's right, the article on model theory doesn't come out and say what a model of a theory is. (It says in an off-hand way what it means for a structure to be a model of a given sentence, and we can squeak by with that if necessary, but it doesn't put the defined term in italics, so it's easy to miss.)

The Wikipedia article "First-order logic" does say what a model is, in the section First-order theories, models, and elementary classes. But you definitely have to read and understand most of the article up to that point to follow that definition: you need to know what a "structure" is, and what the "sentences in a theory" are.

view this post on Zulip Julius Hamilton (Apr 15 2024 at 15:02):

That’s good to know. Philosophy Stack Exchange relies on the SEP heavily.

I will read this section of this book today (Ebbinghaus) B267D4F6-FC69-4DA6-97AB-3A94DE774CDD.png

I am trying to write up a little bit right now though about my view on “soundness”.

view this post on Zulip John Baez (Apr 15 2024 at 15:08):

Julius Hamilton said:

That’s good to know. Philosophy Stack Exchange relies on the SEP heavily.

I will read this section of this book today (Ebbinghaus).

Good - I don't know that book, but it looks like he explains syntax and semantics. I can't tell if he defines models. He defines "satisfaction", which gets close.

view this post on Zulip Vincent R.B. Blazy (Apr 15 2024 at 15:15):

Julius Hamilton said:

Right - but can’t we only define set theory in terms of FOL to begin with? How do we describe the “model”?

If we start a priori with the primitive notions of first order logic, I thought we would assume there is a constant c in the language $L$ which maps to the empty set - the empty set is the model of the symbol c. Everything else that we know about the universe of sets has to come from rules of inference on the axioms of ZFC.

To answer as I would have, but taking John’s answer into account:
Models can be taken into set theories (or other ones), independently of the fact that those can themselves be FOL theories:
1/ Metamathematically — that is, for me, syntactically, outside of any specific theory — models or interpretations are formally (meta)definable maps (typically, computable ones) from the syntax of some source theory to that of another target theory, preserving provability: both can perfectly well be based on FOL, or even be the same theory! Even if that would often mean losing the intuition of "meaning in another, actual and better-understood world".
2/ One can always mathematize any metamathematics by encoding and internalizing it into the mathematical language of some particular theory, and then defining and proving in it as John describes. That’s conceptually interesting, way more powerful, and useful too; even if that chosen theory may perfectly well, at least a priori, lie or be wrong about what it thinks is true or not of the actual syntax: it is limited to talking internally about its coded version. :upside_down: In that framework, a model of a theory — the theory now being a mathematical object, an abstract individual such as some set, formally defined inside a fixed set theory — is also an internal object such as a set-theoretic function or tuple; and the fact that the object "theory" may be in FOL doesn’t affect whether or not it can be encoded into another theory, FOL or not — the latter often called the metatheory, even if it's a normal theory (or even an encoded one, if the game is nested even further), just relatively and locally "meta".

To continue with your example, your c is usually written $\emptyset$ or $\{\}$. :wink: But models of set theories are indeed studied, usually inside other, extended set theories, though not only.

So to sum up, syntax and semantics are provability, taken respectively directly or after (one of the diverse ways of) mapping it into another theory, and both may be metamath or math (at least as far as I see the picture).

view this post on Zulip Vincent R.B. Blazy (Apr 15 2024 at 15:17):

John Baez said:

Vincent R.B. Blazy said:

The key-word here is finite :wink:

This remark fixes one confusion Julius was having: you said Alg enumerates all the finite models of a theory and he asked how it enumerates all the models of a theory. But it doesn't fix another: he apparently didn't know the difference between the models of a theory and the theorems of a theory:

If we enumerate all the theorems of the theory, what is the difference between that, and a “model” of the theory?

So I think he needs to understand 1) what's a model of a theory, and 2) what's a finite model of a theory before he can understand what it means for a piece of software to enumerate all finite models of a theory.

Yes indeed, thanks, hence my second, longer answer, hopefully clarifying a bit syntax and semantics :blush:

view this post on Zulip Julius Hamilton (Apr 15 2024 at 15:34):

Just wanted to state my current understanding of first-order logic, before I read:

We can start out with a single symbol, and with a successor function, generate an alphabet of distinct symbols.

We can then classify the symbols into groups, depending on how we will use them. Some are constants, some are variables, some are functions, some are predicates, some are quantifiers, some are logical connectives, and some are punctuation (the latter of which I believe we could do without).

Let’s call variables and constants “terms”. Intuitively, they represent some things in a domain of discourse. A function applied to a term is considered a new term. This means that we pre-assume there is a little bit more structure in the domain of discourse than just a “collection of things” - there are ways to relate them to each other.

A predicate is like a function, except it takes terms and returns a type called a proposition. (A predicate applied to a term is more normally called a “formula”).

Quantifiers are basically functions: they take propositions (formulas) and return a truth value of 0 or 1.

Logical connectives are like functions, but they take truth values and return truth values.

So we have a hierarchy of types: terms $\to$ formulas $\to$ Boolean values.

There are two main classes of “formation rules”. From terms we can construct new terms. From formulas we can construct new formulas. The rules tell us precisely under what conditions we may do so and in what way.

Imagine a theory of only one rule, $R1$. It says “from any formulae of this form, construct a new formula of that form.”

Conventionally, you do not actually evaluate formulas to be true or false. Actually, if a formula is derived, it implies it is true. If one writes $\forall x\, f(x) = 3$, they mean “$\forall x\, f(x) = 3$ is true”. To say it is false, they can simply apply the $\neg$ operator to it.

If we had 2 formulas, $A$ and $B$, consider if $R1$ says you can construct the formula $A \wedge B$. Then we can say $A, B \vdash A \wedge B$. That means, “$A \wedge B$ is provable from axioms/assumptions $A$ and $B$.”

Here is the drop-off in my understanding. People say a theory is “unsound” if it can prove things that are not true. In the example above, with reference to what could we possibly say $A \wedge B$ is not actually true?

It’s as if we have to cross-reference 2 formal theories with each other to see if they “match up”.

In order for me to say “it is not the case that $A$ and $B$ actually imply $A \wedge B$”, it requires me to “define a semantics” for these symbolic expressions. Sure. But what should it be?

If I decide, “well, with recourse to the axioms of FOL”, then that is circular.

If I say, “well, with recourse to a different formal system than FOL”, then I feel that’s what I’m getting at: the “semantics” have to just be “some other formal system”.

So there is no deep difference between “syntax” and “semantics”. It’s just “syntax” and “syntax”.

I know I’m wrong, just stating my view before I begin reading. :+1:

view this post on Zulip Julius Hamilton (Apr 15 2024 at 15:35):

Vincent R.B. Blazy said:

So to sum up, syntax and semantics are provability, taken respectively directly or after (one of the diverse ways of) mapping it into another theory, and both may be metamath or math (at least as far as I see the picture).

Thanks - I’ll respond to this soon.

view this post on Zulip John Baez (Apr 15 2024 at 15:55):

@Julius Hamilton wrote:

Here is the drop-off in my understanding. People say a theory is “unsound” if it can prove things that are not true.

There are a lot of things to correct in what you wrote, but I'll just note that this is not the definition of "unsound". The word "truth" never appears in a serious way in mathematical logic - we sometimes use it informally, but not in a serious definition like this. The whole point of modern mathematical logic is renouncing the metaphysical concept of truth and trying to replace it with concepts that are precise enough to prove theorems about.

The two most important replacements for "truth" are provability and validity. I explained, probably too briefly, the all-important difference between provability and validity of sentences, given a theory in first-order logic:

A sentence is provable if you can prove it from the axioms in your theory: this is a syntactic notion, since it just involves manipulating strings of symbols. A sentence is valid if it holds in all models of your theory: this is a semantic notion, since it's defined using models.

We say first-order logic is sound because for every theory in first-order logic, any sentence is valid if it's provable. The soundness of first-order logic is a theorem!

Goedel's completeness theorem states the converse: for every theory in first-order logic, any sentence is provable if it's valid.

Taken together these two theorems are the crucial link between syntax and semantics for first-order logic.

view this post on Zulip John Baez (Apr 15 2024 at 15:58):

When we understand these theorems well, we can relax a little about the difference between provability and validity for first-order logic. But these theorems are not trivial to prove - especially the completeness theorem.

view this post on Zulip Julius Hamilton (Apr 15 2024 at 20:16):

Ok.
According to Ebbinghaus, if $S$ is a set of symbols, an $S$-structure $U$ is a pair $(A, a)$, where $A$ is a set called a domain, and $a$ is a map. $a$ maps relation symbols in $S$ to actual relations on $A$, function symbols in $S$ to actual functions on $A$, and constant symbols to actual elements in $A$.

An $S$-assignment is an additional map from all the variable symbols in $S$ to elements of $A$.

An $S$-interpretation is a pair - a structure, and an assignment.

In other words, the “semantics” for a formal system require you to designate more specifically what set the variables range over, what elements the constant symbols actually refer to, what the nature of the functions in $S$ is, and concrete definitions (on $A$) for all the relations in $S$, too.

A common simple theory is the theory of equivalence relations: (symmetry, reflexivity, transitivity). These are axioms. A structure that might satisfy them would be an equivalence relation on the integers. An assignment would amount to choosing basically 3 integers to plug into the axioms: $1=1$, $1=2 \to 2=1$, $1=2 \wedge 2=3 \to 1=3$. An interpretation is just a structure and an assignment simultaneously.

I am not sure if my previous example is a model of the theory of equivalence relations or not. I chose a structure (the set of integers, with their notion of equality), and a variable assignment. The second axiom does not say “1=2”, but rather, “if 1 equaled 2, then 2 would equal 1”.

I would say I’m confused about the definition of the satisfaction relation.

On the next page they define the semantic idea of consequence as opposed to the syntactic idea of provability, which seems to be what I’ve been looking for. The idea is that a formula is a consequence of a set of formulas if every model of that set of formulas is also a model of that formula. In other words, you would never have that set of formulas without also having that specific formula.

766D862E-F1BF-4421-959D-A1BFE6BE1358.png

CEB02869-76D9-490F-A0AF-C0F2084F2221.png

view this post on Zulip Julius Hamilton (Apr 15 2024 at 20:55):

I think I understand. “Satisfies” is trivial. Take a collection of symbols. Simply decide what they mean, i.e., what mathematical structure they are interpreted in. Check if they are true.

My interpretation of a theory of equivalence is indeed a model. A way to find an interpretation that is not a model would be to use some relation that is known to not be an equivalence relation. For example, if I keep the equivalence relation axioms, but I interpret the relation symbol $R$ as the well-known “less than” operator on integers, the axioms fail: $0 < 0$ is false. It’s an interpretation, but it’s not a model.

What I find odd so far is that a model requires a variable assignment. I thought a model of equality on integers would be the set of all integers, and the standard notion of equality. According to this, you have to choose 3 integers as variables $x$, $y$ and $z$.
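
A small sketch of that contrast, brute-forcing the three axioms over a finite slice of the integers standing in for the whole domain (the function name is made up):

```python
from itertools import product

# A finite domain standing in for "the integers" (an infinite domain can't be brute-forced).
domain = range(-3, 4)

def satisfies_equivalence_axioms(R):
    """Check reflexivity, symmetry and transitivity of a binary relation R on the finite domain."""
    reflexive = all(R(x, x) for x in domain)
    symmetric = all((not R(x, y)) or R(y, x) for x, y in product(domain, domain))
    transitive = all((not (R(x, y) and R(y, z))) or R(x, z)
                     for x, y, z in product(domain, domain, domain))
    return reflexive and symmetric and transitive

# Interpreting the relation symbol as equality gives a model of the axioms...
print(satisfies_equivalence_axioms(lambda x, y: x == y))   # True
# ...while interpreting it as "<" gives an interpretation that is not a model:
# reflexivity already fails, since 0 < 0 is false.
print(satisfies_equivalence_axioms(lambda x, y: x < y))    # False
```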

view this post on Zulip John Baez (Apr 15 2024 at 23:26):

Where does Ebbinghaus say a model requires a variable assignment? I'd like to see this, to see exactly what he's up to.

view this post on Zulip Julius Hamilton (Apr 16 2024 at 00:06):

67CBF84F-9566-4CE8-9369-ACB9A8B398A0.png

52C3DF4F-2079-4E1C-8E9D-C8076E29343C.png

943EC761-C7A6-49BB-98D4-3BED59A513F4.png

D69E16A2-2E5A-4660-8555-7418A9FA12F4.png

I see now, that he says an “arbitrary” interpretation. So the “assignment” could be over any of the elements of domain $A$, rather than choosing specific ones as I thought.

I feel like the idea is that the syntax has no connection to the semantics. There is no way to know if $J \vDash \phi$. A human has to just decide that they feel it holds. Or they have to decide it holds in reference to some other mathematical system.

view this post on Zulip John Baez (Apr 16 2024 at 08:44):

Thanks for showing me Ebbinghaus' definition of 'model'. You said it requires a 'variable assignment', so I thought you meant a model assigns to each variable an element of the domain $A$. But no - much to my relief, it does not.

I guess you already got this, but anyway:

A model, or more generally a 'structure', does assign to each constant an element of the domain $A$. It also does more. But it does not assign to each variable an element of $A$.

Then Ebbinghaus introduces a concept he officially calls an assignment, on top of a structure, which assigns to each variable an element of the domain $A$. He defines an interpretation to be a structure together with an assignment.

This is more terminology than I usually carry in my head for this branch of logic, but it seems okay. We need assignments so we can see which sentences in a model are 'satisfied'.

For example if we have a sentence

$\forall x \, P(x)$

a 'structure' will need to choose a set $A$ to be the 'domain' and a 1-ary relation $R$ on $A$ for $P$. An 'interpretation' will need to do all that and also choose an 'assignment': an element $a \in A$ for the variable $x$.

Then we say the above sentence is 'satisfied' if $R(a)$ holds for all interpretations. And clearly it won't be. But the sentence

$\forall x \, P(x) \vee \neg P(x)$

will be satisfied.
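
A minimal sketch of that satisfaction check over a small made-up structure, with formulas encoded as nested tuples and the universal quantifier ranging over the whole domain (the encoding is only for illustration):

```python
# A tiny structure: a domain A and a 1-ary relation R interpreting the predicate symbol P.
A = [0, 1, 2, 3]
R = {0, 2}          # "P(x)" will mean "x is in R"

def holds(formula, assignment):
    """Evaluate a formula (nested tuples) in the structure (A, R) under a variable assignment."""
    op = formula[0]
    if op == "P":
        return assignment[formula[1]] in R
    if op == "not":
        return not holds(formula[1], assignment)
    if op == "or":
        return holds(formula[1], assignment) or holds(formula[2], assignment)
    if op == "forall":
        var, body = formula[1], formula[2]
        return all(holds(body, {**assignment, var: a}) for a in A)
    raise ValueError(op)

# forall x P(x): fails in this structure, since 1 is not in R.
print(holds(("forall", "x", ("P", "x")), {}))
# forall x (P(x) or not P(x)): holds here, and in fact in every structure.
print(holds(("forall", "x", ("or", ("P", "x"), ("not", ("P", "x")))), {}))
```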

view this post on Zulip John Baez (Apr 16 2024 at 08:53):

I feel like the idea is that the syntax has no connection to the semantics. There is no way to know if $J \vDash \phi$. A human has to just decide that they feel it holds. Or they have to decide it holds in reference to some other mathematical system.

I think "just has to decide that they feel it holds" is going way overboard with the subjectivity of it all. Your last sentence is more accurate. In the material we're reading, Ebbinghaus is building up the theory of semantics in the usual way we develop math: by definitions, theorems, etc. If we want, we can do all of this very formally, in your favorite set theory. But what Ebbinghaus is doing - mathematical English that can be translated into formal mathematics - is a lot easier to read if you're trying to learn this stuff.

view this post on Zulip Julius Hamilton (Apr 16 2024 at 14:09):

Thanks.

The highlighted section is what I have been trying to express. If you say the notion of truth in the domain of semantics is not formal but psychological, you are sort of renouncing “formalism” as the true basis of mathematical truth. But if you think truth in the domain of semantics has to be formally established, then you have the problem that you are trying to validate a formal system (the syntax of first-order logic, as being “reliable”) by checking if it “mirrors” a second formal system (say, an axiomatization of set theory). Does the second formal system need to be checked relative to a third formal system, or should we accept that it doesn’t make sense to do this at all, and just stick to one formal system? I’ll see what I think after reading this chapter.

7D510603-3B3E-4753-B7A2-2EF9EFB144ED.png

view this post on Zulip Julius Hamilton (Apr 16 2024 at 14:11):

He explicitly distinguishes between “formal proof” and “mathematical proof”, and that’s the crux, to me - to understand how that distinction can be made valid.

view this post on Zulip Julius Hamilton (Apr 16 2024 at 14:29):

This is my problem:

7AC0B2C2-4A25-4332-A847-1FB9729CB128.png

C7580C25-E4CC-417A-9C5D-6A192D17624A.png

Anyone who devises a formal system ostensibly has succeeded in taking “human judgment” completely out of the equation. The syntactic formation rules are meant to be “unambiguous” - even a computer can do them.

To believe that your formal system actually “says something about the world” requires an epistemic leap of faith.

That is why I think I am looking for some kind of “intuitionistic model theory” instead.

view this post on Zulip Morgan Rogers (he/him) (Apr 17 2024 at 07:39):

All forms of belief require an epistemic leap of faith. The reasoning principles which are considered valid by the mathematical community are socially and historically (and empirically) determined. Much work in the philosophy of logic has gone into identifying the qualities that make some rules but not others admissible and the basis for those judgements. A typical situation is that a reasoning principle that people were previously using heuristically is promoted ("reified") to a formal reasoning principle without much deeper inspection. Have you ever questioned the rules for reasoning with "and" and "or"? Certainly we see situations in CT where we can substitute those rules for others. What about modus ponens?
The essential thing is that a formal system can indeed not prove the validity of an arbitrary interpretation of its own rules: in using a formal system we must always assume or prove such assertions in a meta-theory. To some extent Löb's theorem "formalises" this observation, but I've seen that theorem abused in the past...

view this post on Zulip John Baez (Apr 17 2024 at 13:44):

Yeah, as a kid I was very worried about certainty, how we can be sure we know things, what assumptions we need to develop mathematics, and how to avoid circularity. It's not really anything special about me: this has been an obsession in western mathematics at least since Descartes! In high school I took a course at a local college where we went through Kleene's Mathematical Logic. Then I studied logic further when I went to university, working through the proof of Goedel's theorems in a painfully rigorous course with Kripke, while studying the philosophy of Ayer, Quine, Wittgenstein and other such folks on the side. It's very interesting, but one thing you quickly learn is that

All forms of belief require an epistemic leap of faith.

Having spent a decade or so obsessed with this, I have had enough. I'm not interested in certainty anymore, and I don't care how we know or pretend to know mathematical truths. We just muddle along... and then we die. So I don't really much care if people study model theory using an "informal" metamathematics or an already "formalized" system - either formalized in the old sense or in the new sense of doing it with the help of Lean or Coq or something like that. I still like thinking about logic, but I treat it like other branches of math, meaning I'm willing to reason about it in all sorts of ways.