This is a small point that I wanted to share.
For some reason, I didn't like the common way I saw Terms inductively defined in first-order logic, which is sometimes like this:
If f is a function symbol of arity n, and t1,…,tn are terms, then f(t1,…,tn) is a term.
I am pretty sure that since we said that f is a function symbol, the notation "f(t1,…,tn)" is not defined in this context. f is an element of a set.
We can associate a function symbol with some arguments, by saying "if f is a function symbol of arity n, and t1,…,tn are terms, then f associated with (t1,…,tn) is a term".
This still did not feel formal enough for me, and I prefer the following.
Let T be a function T : ∐_{f ∈ F} Terms^{a(f)} → Terms, where F is the set of function symbols and a(f) is the arity of f.
For each f ∈ F, and for each n-tuple (t1,…,tn) ∈ Terms^n where n is the arity of f, T(f, (t1,…,tn)) ∈ Terms.
That is to say, if n is the arity of function symbol f, take the n-ary product Terms^n of the set of terms Terms, and associate a term T(f, (t1,…,tn)) with each n-ary tuple of terms.
I believe this helps one to move in a more abstract direction where they can look at the elements of first-order logic as well-defined mathematical concepts.
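A minimal OCaml sketch of this proposal, not taken from the discussion itself (the names symbol, arity, and apply are purely illustrative): terms form an inductive type, and term formation is an honest function that checks the arity before building anything.
(* Illustrative sketch: terms over a signature, with term formation as an actual function
   rather than undefined notation. *)
type symbol = { name : string; arity : int }
type term =
  | Var of string              (* atomic terms *)
  | App of symbol * term list  (* a function symbol associated with a tuple of terms *)
(* The term-forming map: defined only when the tuple length matches the declared arity. *)
let apply (f : symbol) (ts : term list) : term option =
  if List.length ts = f.arity then Some (App (f, ts)) else None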
Julius Hamilton said:
For some reason, I didn't like the common way I saw Terms inductively defined in first-order logic, which is sometimes like this:
If f is a function symbol of arity n, and t1,…,tn are terms, then f(t1,…,tn) is a term.
I am pretty sure that since we said that f is a function symbol, the notation "f(t1,…,tn)" is not defined in this context. f is an element of a set.
Maybe you're not reading sufficiently careful and precise treatments of first-order logic, then. They would avoid saying f is an element of a set, because that brings in set theory, and in a traditional treatment from the ground up, first-order logic comes before set theory: we write the axioms of set theory in first-order logic.
A function symbol of arity 3 (for example) is a letter, like f, which we have decreed has this property: if t1, t2, t3 stand for three terms, then f(t1,t2,t3) is again a term. This is part of a complicated inductive definition of 'term'.
The parentheses are part of the formal notation. This is both deliberate and confusing to a beginner: f(t1,t2,t3) is not the result of applying a function named f to inputs t1, t2, t3, it's just some syntax, and f t1 t2 t3 would be an equally valid choice of syntax. However, the conventional syntax is supposed to evoke the intended semantics: eventually, f will be interpreted as an actual function and the term f(t1,t2,t3) will acquire an interpretation as the result of applying (the interpretation of) f to the inductively defined interpretations of t1, t2, t3.
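To make that syntax/semantics split concrete, here is a small OCaml sketch, assuming an interpretation into the integers purely for illustration; interpret, funcs, and env are made-up names. A term is plain data; an interpretation supplies an actual function for each function symbol and a value for each variable.
(* Syntax: terms are just data. *)
type term =
  | Var of string
  | App of string * term list
(* Semantics: interpret a term once each function symbol and variable has a meaning. *)
let rec interpret (funcs : string -> int list -> int) (env : string -> int) (t : term) : int =
  match t with
  | Var x -> env x
  | App (f, args) -> funcs f (List.map (interpret funcs env) args)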
Yes, I forgot to clearly explain that not only is f a mere letter, the parentheses and commas are mere symbols, while t1, t2, t3 are metavariables standing for symbol strings (terms), so that f(t1,t2,t3) stands for a symbol string. E.g. if t1 stands for x, t2 stands for x' and t3 stands for x'', then f(t1,t2,t3) stands for the symbol string f(x,x',x'').
I feel that Julius keeps getting tangled up in these fundamental issues because he hasn't studied an introduction to mathematical logic that goes through them very carefully. My best suggestion is Kleene's Introduction to Metamathematics, which is available on libgen. There might be something better, but I don't know it.
I hope nobody is still using textbooks that treat parentheses and commas as part of the grammar of terms so that f(x,y) is a "well-formed" string of six symbols and f)x,y( is an "ill-formed" string of six symbols.
Parsing is an interesting problem for programmers, but I don't think mathematicians should be worrying about it. It's better to think of f(x,y) as a notation for an abstract syntax tree with a root node labeled by f and two children labeled by x and y.
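As a one-off illustration of that reading (the tree type and the Node constructor here are made up, not from the thread): under it, the notation f(x,y) names a tree value rather than a sequence of six symbols.
type tree = Node of string * tree list
(* the abstract syntax tree written "f(x,y)": root labeled f, two children labeled x and y *)
let fxy = Node ("f", [ Node ("x", []); Node ("y", []) ])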
There are many approaches, but the main problem, as I see it, is that Julius hasn't carefully worked through one super-rigorous approach that starts with bare-bones syntax. The problem with abstract syntax trees is that they're defined using some math one has previously laid down, and Julius is struggling with circularity.
Trying to pick up a bit here and a bit there can be very confusing, I think, before one has mastered one ground-up approach.
To be clear, though, Julius, the AST approach Mike is suggesting is somewhat similar to what you were trying to work up: you can define the set of all terms recursively as containing atomic terms plus functions Terms^n → Terms for every n-ary function symbol. This is fine if you don't feel foundational anxieties about using all the notions in the previous sentence, but if you do feel those anxieties then a more mid-century approach like John's pointing you at is the safest.
John Baez said:
There are many approaches, but the main problem, as I see it, is that Julius hasn't carefully worked through one super-rigorous approach that starts with bare-bones syntax.
Yes, I agree.
The problem with abstract syntax trees is that they're defined using some math one has previously laid down, and Julius is struggling with circularity.
I would argue that flat strings of symbols are also defined using math that one has previously laid down. (You have to start with a set of symbols, then consider the set of all finite sequences of elements of that set, etc.) You can't pull yourself up purely by your bootstraps: mathematical logic is a subject of mathematics, so you have to already know some mathematics in order to do it.
Well, in some widely used foundational approach to first-order logic you don't start with an arbitrary set of variables, and you don't even start with a concept of 'set'. You start by being able to write things like this:
x
x'
x''
x'''
x''''
and so on, and you are told these things are called variables.
And so on. To do first-order logic this way you need to already have a level of expressive power that's at the level of Backus-Naur form, nowhere near set theory.
If you are a mathematician and you want to study logic freely using all the math you want, that's fine too! That's what I usually do.
You don't need a full set theory but you do need to have some notion of "and so on", and if your goal is to have set up maximally irrefutable foundations you're going to have to take that very seriously. JP Mayberry has an incredibly intensive treatment of all this in my favorite logic book no one else has ever read.
Above I was suggesting Backus-Naur form as a way to handle "and so on" with minimal conceptual prerequisites. In the example above, we can use it to say
<variable> ::= "x" | <variable> "'"
This amounts to saying:
Want to know what a variable is? Okay: x is a variable, and if you have a variable you can get another variable by writing a ' after it.
So we're assuming some basic competence on the part of the user: they can remember the variables they've cooked up so far, and they know they can get a new one by writing a ' after one they already have.
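As a sketch of how little machinery the grammar above needs (the function name variable is illustrative, not from the discussion): the n-th variable is just "x" followed by n primes.
(* <variable> ::= "x" | <variable> "'" , read as a recipe for producing strings *)
let rec variable (n : int) : string =
  if n = 0 then "x" else variable (n - 1) ^ "'"
(* variable 0 = "x", variable 3 = "x'''" *)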
It’s a bit beyond me to see what the actual semantics of that discussion is, other than “there is a countable set of variables”, insofar as your description of what a variable is is equivalent to the standard inductive definition of the natural numbers… but this is probably being annoying in a way that’s not useful in this context, and I’ll let it be unless someone else finds this navel-gazing worthwhile.
I realized I have some questions: can you use BNF to describe the well-formed formulae in a language that has infinitely many function symbols of each arity 0,1,2,3,... ? Or can it only handle something less, like finitely (or infinitely) many function symbols of some finite collection of arities, which is already enough for the mathematics I do?
Can you use it to define the somewhat annoying rules governing free and bound variables for α-conversion in the λ-calculus? (Maybe de Bruijn indices do better in this respect?)
These are questions I should have asked back when I was a kid trying to strip math down to some extremely unobjectionable syntactic constructions. I don't think I knew about "context-free grammars" but now I see that BNF seems able to describe context-free grammars and nothing more?
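Regarding the parenthetical about de Bruijn indices, here is a minimal OCaml sketch of that idea (the constructor names are made up): bound variables become numbers counting how many binders separate a use from its binder, so α-equivalent terms are literally identical and most of the free/bound bookkeeping disappears.
type lam =
  | Ix of int          (* a variable: the number of binders between its use and its binder *)
  | Abs of lam         (* a lambda abstraction, binding index 0 in its body *)
  | App of lam * lam
(* lambda x. lambda y. x  is written  Abs (Abs (Ix 1)) -- no names, so no alpha-conversion needed *)
let k = Abs (Abs (Ix 1))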
Sorry, I misremembered what BNF is (and deleted my comment while you were responding to it).
Okay - I changed my reply so that I didn't write "Thanks, I didn't know that!" after Kevin's remark. :smirk:
I was thinking of an analogous-looking definition of ASTs as an inductive datatype like
data term :=
| var : variable -> term
| app : function * term list -> term
which can be written in a BNF-like style (compiling out the notion of "list") as
<term> ::= <variable> | <function> <termlist>
<termlist> ::= <empty> | <term> <termlist>
but apparently what's called "BNF" is actually defining only a set of strings of symbols rather than an actual inductive datatype, i.e. it's a syntax for "recognizing languages" like, as you say, a context-free grammar.
However, that means that by using what's officially called "BNF" you haven't gotten any simpler than "the set of all finite sequences of elements of a set of symbols", since what BNF is officially doing is characterizing a subset of that set of finite sequences. I would argue that your intuitive description of variables
Want to know what a variable is? Okay: x is a variable, and if you have a variable you can get another variable by writing a ' after it.
is really more of a description of an inductive datatype, and we could define ASTs similarly by saying
"Want to know what a term is? Okay: any variable is a term, and if you have a function symbol and a list of terms you can get another term by combining them with parentheses."
Here's a question to disentangle the two interpretations. Suppose I say
"Want to know what an additive expression is? Okay: any variable is an additive expression, and if you have two additive expressions you can get another additive expression by putting them together with a + in between."
Given this definition, would you consider x+y+z to be an "additive expression", since it can be obtained by first putting x and y together with a + in between, and then putting x+y and z together with a + in between (and also by first putting y and z together with a + in between, and then putting x and y+z together with a + in between)? Or would you consider instead (x+y)+z and x+(y+z) to be additive expressions?
I guess I'd take the first interpretation, taking x+y+z to be an additive expression, since
"Want to know what an additive expression is? Okay: any variable is an additive expression, and if you have two additive expressions you can get another additive expression by putting them together with a + in between."
doesn't mention anything about parentheses. At least this is what I might do if I were in a very syntactic mood, as I appear to be now. If I were in my ordinary mathematical mood I might stick in parentheses, but I also probably wouldn't say things like the quoted phrase.
I think in my childhood search for maximum certainty I should have learned about the Chomsky hierarchy and wondered how high up this hierarchy we need to climb, to get a formalism powerful enough to describe the set of provable sentences in some version of first-order logic. I liked the idea that ultimately to do math you just needed to understand some purely syntactic rules for manipulating strings of symbols, so I was curious about what exactly is a "rule for manipulating strings of symbols". I realized that a rule like "if the string is a true sentence, then..." is not purely syntactic (unless you have a syntactic definition of truth!), but I would have loved learning about the Chomsky hierarchy.
Now I'm much less excited by all this stuff, and I'm mainly bringing it up because I'm deeply puzzled by what @Julius Hamilton is up to in his thoughts about first-order logic, like this:
Julius Hamilton said:
For some reason, I didn't like the common way I saw Terms inductively defined in first-order logic, which is sometimes like this:
If f is a function symbol of arity n, and t1,…,tn are terms, then f(t1,…,tn) is a term.
I am pretty sure that since we said that f is a function symbol, the notation "f(t1,…,tn)" is not defined in this context. f is an element of a set.
I don't understand what he meant here. But more fundamentally, tracking the various questions he's been raising for the last few months, I don't get what he's trying to do.
It may have little to do with what I was trying to do (strip math down to minimal assumptions), but it reminds me a bit of that.
Okay. Well, I would argue that the interpretation with parentheses is a better one to be in the habit of doing nowadays, and doesn't require any additional expressive power. In particular, I think this version is better because with that definition you can define an operation on additive expressions by saying "if it's a variable, do this; but if it's a combination of two expressions with a +, do that". With the non-parenthesized version, trying to do that you run into the problem that, as with my x+y+z example, a given expression could be obtained from a + in multiple different ways and it's not clear that all the resulting "that"s will agree.
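A small OCaml sketch of this point (expr, Plus, and count_vars are illustrative names, not from the thread): with the tree reading, each additive expression is built in exactly one way, so a definition by cases is automatically well-defined.
type expr =
  | Var of string          (* a variable *)
  | Plus of expr * expr    (* two additive expressions put together with a + *)
(* structural recursion: "if it's a variable, do this; if it's a sum, do that" *)
let rec count_vars (e : expr) : int =
  match e with
  | Var _ -> 1
  | Plus (a, b) -> count_vars a + count_vars b
(* (x+y)+z and x+(y+z) are two distinct values here *)
let e1 = Plus (Plus (Var "x", Var "y"), Var "z")
let e2 = Plus (Var "x", Plus (Var "y", Var "z"))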
In compiler land, if the parentheses matter, we would call that "concrete syntax"; if they don't, we would call that "abstract syntax". Concrete syntax only "matters" if we are interested in parsing and printing abstract syntax into strings, where the concrete syntax determines things like which strings yield ambiguous parse trees. I guess the question here is whether it is good to cover concrete syntax before or after abstract syntax, or at all - maybe it confuses things when one isn't interested in parsing. FWIW, some treatments of logic pre-date the invention of context-free grammars and their notations (BNF).
Yes -- my argument is that dealing with concrete syntax is a distraction from understanding the important ideas when first learning logic. (Or, really, at any time except when one is coding a parser or a pretty-printer.)
When I was a kid I wanted to know how everything worked, including those parentheses you see all over the place. Now that I kinda know, it seems like no big deal. So if you're teaching logic to someone, one question is whether you're teaching it like an ordinary math course, where all sorts of things go unspoken, or whether you're trying to lay out precise rules that govern the manipulation of syntax. I'm not making any claims about what approach is best or what I'd do. I have never taught logic and I hope I never have to.
Nowadays it's probably common to describe "precise rules that govern the manipulation of syntax" in a computer science course, but when I was learning logic there weren't so many computer science courses - this would have been around 1978, when I was in high school. All the "persnickety" aspects of logic were still addressed in the context of writing symbol strings on paper.
Just to re-state some of this for Julius, we would more rigorously say "If f is a function symbol of arity n, and t1,…,tn are abstract syntax trees, then the tree with f as root node and t1,…,tn as sub-trees is an abstract syntax tree."
@Kevin Carlson All that is more about what we actually, formally, do with our syntax than about what we claim to do by conceptualizing it meaningfully in our minds :stuck_out_tongue:
John Baez said:
I realized I have some questions: can you use BNF to describe the well-formed formulae in a language that has infinitely many function symbols of each arity 0,1,2,3,...?
Yes you can: for each arity, those function symbols, just like any other symbols, can only be infinitely many if they're recursively enumerated or indexed by an r.e. (recursively enumerable) index (like Peano numerals, for instance: 0, S0, SS0, …), just like your examples with variables; then you can use the metavariable(s) of this particular recursive enumeration in the BNF grammar to form the terms. :)
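For instance, here is a sketch in the same BNF style as earlier, restricted to a single arity and with "g" chosen arbitrarily as the first binary function symbol (it reuses <variable> from above): the binary symbols g, g', g'', … are enumerated exactly like the variables were.
<binfun> ::= "g" | <binfun> "'"
<term> ::= <variable> | <binfun> "(" <term> "," <term> ")"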
Can you use it to define the somewhat annoying rules governing free and bound variables for α-conversion in the λ-calculus? (Maybe de Bruijn indices do better in this respect?)
For that on the other hand, I guess you need contextual grammars, indeed! IIRC (that remains meta, though, not needing abstractions like sets or smth)
Mike Shulman said:
However, that means that by using what's officially called "BNF" you haven't gotten any simpler than "the set of all finite sequences of elements of a set of symbols", since what BNF is officially doing is characterizing a subset of that set of finite sequences.
Simplicity wasn’t the point I was targeting, though; that was only about whether one needs an ontological commitment to abstraction before concreteness, to math before metamath (and thence to circularity), or not :) (which, I guess, may be more or less pedagogical than something else depending on one’s way of understanding)
Btw, there’s indeed a bit more to BNF than enumerations of strings when it’s used in metamath: there’s also its recursiveness when the productions enumerate potentially infinitely many strings, and then the ability to naturally (meta)prove things about all of them by the structural induction principle (in addition to case analysis) :blush:
Oh, also, good point about the natural but perhaps unwanted associativity of infix binary formers (e.g. + in BNF): that’s why prefix (or dually, postfix, depending on the directionality of one’s favourite natural language) notations are usually preferred in metamath! They also avoid needing parentheses even for precedence between different operators :smile:
What I mean is:
Let F be a set, whose elements we call function symbols.
Let a be a function a : F → ℕ.
Let Terms be a set, whose elements we call terms.
The set of terms is closed under the association of function symbols with n-tuples of terms, where n = a(f) is the corresponding arity.
So, I'm pretty sure we can say: Terms = ∐_{f ∈ F} Terms^{a(f)}.
However, it's possible we can make this nicer, because I think the disjoint union might be the same as a coproduct indexed by arities, though I haven't verified this:
Terms = ∐_{n ∈ ℕ} (a⁻¹(n) × Terms^n).
I don't consider myself to be struggling, but rather thriving. I am completely content with how my everyday practice of investigating mathematical questions is going. I always do appreciate corrections and suggestions, though.
Sure, the disjoint union equation is very close to one way of describing the set of terms: you can read your equation as saying that every term is uniquely expressible as a function symbol together with a family of terms indexed by the arity of that function symbol. The main problem is that you need a base case, some terms you can start out with, say, a set of variables. So I might rather say a term is either a variable or a function symbol "applied to" a family of terms indexed by its arity.
The less practically important but more theoretically interesting problem is, in what sense does such an equation, where Terms appears on both sides, define the set of terms? Consider the simple special case of one variable, x, and one function symbol, f, of binary arity. There's lots to say here but perhaps it's a good moment to pause and let you think about, or decide whether you even care about thinking about, what it might mean to you to "solve this equation."
I'll restrain myself from blurting out the answer, but I will add that I think you probably should care about thinking about that. (-:
I don't consider myself to be struggling, but rather thriving.
Okay, sorry - your new clarification reveals to me that you're concerned with completely different issues than I thought from first reading. You are, now at least, happy to use set theory to study terms in first-order logic, even though set theory is often defined in terms of first-order logic, so at a certain stage one wants a way of understanding terms that doesn't rely on set theory. That's okay: you're at some other stage now.
Even though we're at some other stage now, I want to reiterate my disagreement that it is possible to understand logic without relying on an ambient set-theory-like mathematics. It's true that you can use a weaker metatheory for this than the commonly used set theories like ZFC, but in order to formally define and do mathematics with "terms" you have to have at least some of the basic tools of mathematics already to hand. This isn't circular: mathematical logic studies a model or representation of the "real mathematics" that human mathematicians do, just as mathematical physics studies a model or representation of the "real physics" that happens out there with tables and electrons.
Relatedly, the JP Mayberry book I linked above has as its basic thesis that sets are by no means defined using first order logic, but that set theory qua foundation is based on a non-formal system of statements that are posited as “really true” in any sense of that phrase that leaves you feeling epistemologically kosher. And then indeed syntax and then the first order theory of sets—a useful mathematical model of set theory—and all other formally axiomatic math can be built on that foundation, to whatever extent you’re willing to take it as solid. So it’s at least something less than a universally accepted fact that the first-order theory of sets constitutes a definition of set theory.
Maybe I expressed myself too informally for this crowd. By "set theory is often defined in terms of first-order logic" I simply meant that the axioms of ZFC (say) are written in first-order logic. Obviously there's no definition of "set" in this approach. The properties we want sets to have are laid out in the axioms.
We need set theory to study the model theory of first-order logic, but I wasn't talking about model theory at all.
Hmm, I think there’s a more interesting disagreement here than just about model theory versus proof theory, though it’s fun if there’s even disagreement about whether there’s disagreement.
I disagree with your claim that it's fun.
Hopefully this isn't too much of an aside, but I have a question.
Logic was once explained to me as "what can be proven about mathematics with formal languages" which I interpret as a sort of foil to Euclidean geometry, which you could say is something like "what can be proven about mathematics with figures of lines, points and circles".
Logic as "studying a model of how mathematicians do math" seems like an over reach to me, because mathematicians do math in lots of different ways. But I'm assuming I'm misunderstanding something.
The later definition at least sounds much more exciting XD, what simple definition of Logic would you use?
Mike Shulman said:
Even though we're at some other stage now, I want to reiterate my disagreement that it is possible to understand logic without relying on an ambient set-theory-like mathematics.
I want to make sure I fully understand what you mean, so I would appreciate some additional clarification.
The weakest reading of your statement that I could muster, that one must have some kind of mathematics before using mathematics to study logic, seems almost trivially true. Of course, to study logic using X, we need to have some access to X. If this is what you meant, then I don't think anyone here would disagree.
However, your phrasing, especially your qualifier (set-theory-like) makes me suspect you intend to make a stronger claim. One possible interpretation is that a person must already be familiar with a mathematical framework resembling set theory in order to understand logic in any meaningful way: whoever thinks they understood logic without knowing set theory is at best mistaken. I struggle to see how this could hold. It seems to contradict at least:
Anyway, I don't wish to argue further against this, I doubt this is your actual position. I guess what I'm struggling with is your phrasing set-theory-like mathematics.
Are you saying that if someone understands something like projective geometry deeply, they still couldn't use that as a basis to formalize syntax and start to represent and study what mathematicians do? How does one arrive at this conclusion? Or, on the contrary, is projective geometry also "set-theory-like mathematics"? Is number theory (which is demonstrably sufficient to develop a formal analogue of term, formula, etc. using unique factorization)? If so, then what exactly does that term exclude?
(note: much like John, I don't find this an especially fun subject to debate. But I've seen too many cases where claims like this, left unquestioned, later got cited by others as if they were settled consensus, sometimes even as "logic experts agree" or "authoritative sources". So I consider it important to get some clarity on what's actually being claimed while the discussion is happening)
Kevin Carlson said:
The main problem is that you need a base case, some terms you can start out with, say, a set of variables. So I might rather say a term is either a variable or a function symbol "applied to" a family of terms indexed by its arity.
The base case happens with nullary function symbols, aka constant ones. Though if you want variables too, either you add them, or you upgrade to higher order, so as to be able to choose to use only variable symbols even for functions :blush:
@Mike Shulman A bit like Zoltan, I’m not exactly sure what you mean. You don’t need mathematics to do metamathematical Logic, in the sense that you don’t need abstractions and the like, because you’re only dealing with concrete strings of signs; or, ok, at least you need to consider identical duplication of signs to be possible. Though yes, indeed, you need techniques that can be considered mathematical, partly because no precise demarcation was made at the beginning of those studies… But anyway, doing metamathematics about syntax and mathematical formalism remains distinct from doing mathematical logic encoded in some theory of abstract objects like sets or others (and then generalized at our leisure inside it), which only mimics actual syntax, with no actual certainty that the encoding and its theorems are sound with respect to the actual syntactic stuff. :man_shrugging: