I just got to the part of my logic book where they discuss sequent calculus and I’d like to discuss it from different angles.
My first question is about whether there is an interesting structural relationship between formula-formation rules and sequent calculus derivation rules.
The rules for forming formulas in my book are that atomic formulas are of the form $t_1 = t_2$ or $R t_1 \ldots t_n$ (for terms $t_i$ and a relation symbol $R$), and compound formulas are formulas with logical connectives or quantifiers applied to them.
The rules for the sequent calculus in my book are slightly more complicated, but the book says they can be grouped into structural rules, quantifier rules, logical rules, and equality rules.
I would like to describe both of these systems of rules in a purely algebraic way so I can compare their structure to each other.
Sequent calculus uses a format that is new to me: strings of formula symbols indicate that all but the last symbol are assumptions ("antecedents") and the last symbol is a claim (the "succedent"), and these strings are called "sequents". Then, it appears, you can stack sequents vertically, with a line dividing premises from a conclusion. I am curious whether the vertical direction should be read identically to the horizontal one; in other words, the sequents above the line are assumptions, and the sequent below the line would be a claim that follows from those assumptions.
(attached image: the textbook's chart of sequent calculus rules)
Can one write sequent calculus rules where sequents are elements of a set, and the dividing line in the sequent calculus notation is taken as an operation associating certain input sequents with output sequents? What kind of algebraic structure would it be (for example, is it a total or partial operation?) Thanks.
Yes, the "dividing line" of sequent calculus may be read as a consequence: if you can prove the sequents above it, then you can prove the sequent below it. Each sequent calculus rule (that is, each "kind" of dividing line) is a relation on sequents, not an operation (if by "operation" you mean "function").
Sequents are usually written $\varphi_1, \ldots, \varphi_n \vdash \psi$, where the $\varphi_i$ and $\psi$ are formulas (the book you're reading omits the $\vdash$ symbol, which is usually read "thesis"). The intuitive meaning of such a sequent is: if all the $\varphi_i$ hold, then $\psi$ holds.
You may confirm this intuition by proving that, in sequent calculus, $\varphi_1, \ldots, \varphi_n \vdash \psi$ is provable iff $\vdash (\varphi_1 \wedge \cdots \wedge \varphi_n) \to \psi$ is provable.
So both the thesis symbol $\vdash$ and the horizontal lines are sorts of "meta-implications". They allow one to express provability structurally, that is, the provability of a compound formula is expressed in terms of the provability of its components.
For example, $\varphi \wedge \psi$ is provable if $\varphi$ is provable and $\psi$ is provable (this is actually an iff). Similarly, $\varphi \vee \psi$ is provable if $\varphi$ is provable or $\psi$ is provable (this is not an iff in classical logic, but it is an iff in intuitionistic logic). The necessity of $\vdash$ appears when you analyze the provability of an implication $\varphi \to \psi$: you can't characterize it in terms of the provability of $\varphi$ and/or $\psi$, because it is "contravariant" in $\varphi$. So you introduce the possibility of decomposing $\vdash \varphi \to \psi$ as $\varphi \vdash \psi$, and now you can start reasoning on the provability of $\psi$ under the hypothesis $\varphi$.
(This is all very rough, I'm just trying to give you the intuition, I hope it's making sense...).
Thanks.
I want to try coming to an understanding of something by formulating small, falsifiable claims and then proving or disproving them.
First I need to define sequent calculus formally.
I think the only way I know how to do this is using first-order logic.
Should the ‘thesis’ relation and the horizontal line be considered two different relations?
I’ll explore the case where the answer is ‘yes’.
Then I think the signature of this theory has two relation symbols: one for the thesis relation and one for the horizontal line.
I read online that in sequent calculus, you can have finitely many formulas on both the LHS and RHS of $\vdash$, where the LHS formulas are understood to be connected with $\wedge$, and the RHS formulas with $\vee$. But in my book (so far), it looks like the RHS is always a single formula.
I can think of another two cases for how to write this signature. I think $\vdash$ could be a function symbol, if it were seen as a closed operation where "$\Gamma \vdash \varphi$" is itself a formula; versus a relation symbol, which I think evaluates to "True" or "False". I will work with the second option, because I think we do not want to mix meta-language and object-language, but I'm not sure.
I think if this is our signature, we should next consider what the elements of the domain are, and what the axioms of the theory should be. I think the elements of the domain are formulas. I wonder if a theory with two relation symbols can be separated into two theories, where the elements for the first theory are formulas under the relation $\vdash$, and the elements for the second theory are sequents under the horizontal-line relation.
I think that now, in trying to write axioms for these relations, we can write the rules of sequent calculus as they were presented in the textbook. I think we need different symbols for the formulas of the signature I defined, versus formulas in the object language, so I'll reserve dedicated symbols for formulas of the object language.
I now revise my above signature. I think the rules of sequent calculus make use of sets of formulas (denoted by $\Gamma$ in my text), so I think we should have a "set membership" relation symbol in the signature, $\in$. I also wonder if first-order logic allows you to state that the '$\vdash$' relation can have any $n$-arity. But now I think we don't need that, because we can make it binary, where the first argument is a set of formulas. I think this allows us to express tautologies in sequent calculus, where the set of premises is empty.
Here’s an example of trying to write the rules of sequent calculus as formulas in the above language:
.
This would be my attempt at writing the “assumption rule”:
I’ll continue this exploration soon. Please let me know if you or anyone has any critiques of this approach. Thank you!
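For concreteness, here is one guess (mine, not necessarily how the book writes it, or how I will finally write it) at the assumption rule as a first-order formula over the signature sketched above:
```latex
% One guess at the assumption rule as a first-order axiom in a language
% with a membership relation and a turnstile relation:
\forall \Gamma\, \forall \varphi\, \bigl( \varphi \in \Gamma \;\rightarrow\; \Gamma \vdash \varphi \bigr)
```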
(P.S.: As far as I can tell, you never avoid circularity in formal logic. You need sets to define languages, languages to define theories, theories to define structures, structures to define rules, rules to define everything, etc. I just feel like mentioning that in the time I have been studying logic, I feel like ‘coherentism’ is actually the mainstream assumption in modern logic, but for some reason, I feel like people don’t always say it that explicitly. By coherentism, I mean the idea that ‘justification’ arises bidirectionally, so there is no “starting point”, more a loop of A justifies B and B justifies A.)
Peeking ahead, after writing these formulas, I want to explore what theorems can be proved from them - using sequent calculus! But a sequent calculus “one level up”.
After studying the rules of sequent calculus, I felt like some of them are intuitive and will be pretty easily remembered when they are needed. The rules that stood out more to me as codifying aspects of reasoning were modus ponens, proof by cases, and contradiction. (I think the chain rule would be important too, but it seems intuitive; I think it is also called the "cut rule", and I think Gentzen proved that some logics don't need it.) I think I read that in some simpler deductive systems, like Hilbert-style systems, modus ponens might be the only rule of inference. I think proof by cases is comparable to "independence proofs", of which there are some famous ones. And I think the contradiction rule is significant since it can be rejected, resulting in intuitionistic logic. I mean to say that some of the rules of sequent calculus seem more "significant" than rules like the assumption rule or the substitution rule for equality, which seem less interesting (but I might be wrong).
From that perspective, I think maybe in looking for proofs that can be made from the rules of sequent calculus, a good place to start is to see if you can use modus ponens on any of the axioms, since modus ponens seems so essential to me as reflecting the nature of making inferences.
Here are the rules. I think I can write two copies of the rules, where the object language uses one set of symbols for its formulas, and the meta-language uses another set of symbols for its formulas.
I think it is important to note that the sequents above the horizontal line form a sequence, which is why they're enclosed in parentheses; but the collection of formulas on the LHS of $\vdash$ is a set (I think).
The Antecedent Rule:
The Assumption Rule:
The Proof by Cases Rule:
The Contradiction Rule:
The Conjunction-Antecedent Rule:
The Conjunction-Succedent Rule:
(more coming soon - 4 more rules to go.)
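(For reference, here is my guess at the standard shape of the rules named above, written with an explicit $\vdash$; the book's exact formulation may differ, for example in whether the antecedent is a sequence or a set.)
```latex
% My guess at standard formulations; check against the book:
\text{(Ant)}\ \frac{\Gamma \vdash \varphi}{\Gamma' \vdash \varphi}\ \text{if } \Gamma \subseteq \Gamma'
\qquad
\text{(Assm)}\ \frac{}{\Gamma \vdash \varphi}\ \text{if } \varphi \in \Gamma

\text{(PC)}\ \frac{\Gamma, \psi \vdash \varphi \qquad \Gamma, \neg\psi \vdash \varphi}{\Gamma \vdash \varphi}
\qquad
\text{(Ctr)}\ \frac{\Gamma, \neg\varphi \vdash \psi \qquad \Gamma, \neg\varphi \vdash \neg\psi}{\Gamma \vdash \varphi}

(\wedge\text{A})\ \frac{\Gamma, \varphi \vdash \chi}{\Gamma, \varphi \wedge \psi \vdash \chi}
\qquad
(\wedge\text{S})\ \frac{\Gamma \vdash \varphi \qquad \Gamma \vdash \psi}{\Gamma \vdash \varphi \wedge \psi}
```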
After that, the idea is to consider the theory consisting of the collection of all the statements above. Maybe these can be some "tips" on a "proof strategy":
The Antecedent Rule is (on the "top level") an implication statement. Since I think modus ponens is valid in sequent calculus and can be derived from its axioms, I can ask myself if there is any set of formulas which is a subset of $\Gamma$. Of course there are: all subsets of $\Gamma$ meet this condition. And then I think I can ask: is there any formula which is known to be provable from any of those subsets? We can use the Assumption Rule! If a formula is a member of a set of formulas, then that set of formulas proves that formula. By using these two rules, we prove that any collection of axioms, or any individual axiom, is provable.
I think we can write this like this, but I’m not sure:
.
This might be tautological, unnecessary, or even invalid, but I think it’s a first exploration of using sequent calculus to prove things about sequent calculus.
I think there is a categorical way to describe the structure of sequent calculus applied to itself that I’d like to work out.
I think a strategy you could use to check whether you've tried all possible ways of deriving new formulas from the axioms is to simplify their notation so as to focus on inputs and outputs.
For example, the Antecedent Rule requires the specification of two sets of formulas $\Gamma$ and $\Gamma'$, and a formula $\varphi$. I think if the sequent calculus in my book is presented as having 10 rules, that means there are 10 choices for the rule; the number of subsets of the formulas we have so far gives the choices for $\Gamma$; and the same for $\Gamma'$. I think this means that we can compute every possible choice. I think some choices will not produce any theorems, because the condition of the conditional statement won't be met; I think some choices will return formulas we already know to be true; and I think some choices will return new formulas. I would like to explore specifying such an algorithm.
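Here is a rough sketch of that brute-force idea (Python, my own toy encoding; the single rule and the starting sequents are made up for illustration):
```python
from itertools import permutations

def derive_closure(axioms, rules, max_rounds=3):
    """Saturate a pool of sequents under a list of rules.
    Each rule takes a tuple of premise sequents and returns the set of
    conclusions it licenses (empty if the rule doesn't apply)."""
    known = set(axioms)
    for _ in range(max_rounds):
        new = set()
        for rule in rules:
            for k in (1, 2):  # rules here take one or two premises
                for premises in permutations(known, k):
                    new |= rule(premises) - known
        if not new:
            break
        known |= new
    return known

# Example rule: the ∧-succedent rule as a forward-producing rule.
# Sequents are pairs (frozenset of antecedents, succedent string).
def and_succedent(premises):
    if len(premises) != 2:
        return set()
    (g1, f1), (g2, f2) = premises
    if g1 != g2:
        return set()
    return {(g1, f"({f1} ∧ {f2})")}

axioms = {(frozenset({"A", "B"}), "A"), (frozenset({"A", "B"}), "B")}
# After one round the pool also contains Γ ⊢ (A ∧ B) and Γ ⊢ (B ∧ A).
print(derive_closure(axioms, [and_succedent], max_rounds=1))
```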
Julius Hamilton said:
Thanks.
Should the 'thesis' relation and the horizontal line be considered two different relations?
Yes. The horizontal line relates sequents. The thing above the line and the thing below the line can both contain $\vdash$. A proof in the sequent calculus is something like a tree growing upwards. The leaves at the top are the starting sequents, which are tautologies such as $\varphi \vdash \varphi$, or axioms; the root at the bottom is the final conclusion sequent.
The horizontal edges can be thought of as edges which join nodes in the tree. (The edges can have multiple sources, so it is more like a multicategory or a monoidal category.)
Julius Hamilton said:
(P.S.: As far as I can tell, you never avoid circularity in formal logic. You need sets to define languages, languages to define theories, theories to define structures, structures to define rules, rules to define everything, etc. I just feel like mentioning that in the time I have been studying logic, I feel like ‘coherentism’ is actually the mainstream assumption in modern logic, but for some reason, I feel like people don’t always say it that explicitly. By coherentism, I mean the idea that ‘justification’ arises bidirectionally, so there is no “starting point”, more a loop of A justifies B and B justifies A.)
"formal logic" is a bit of a broad category.
In model theory, which is a branch of mathematics and is not a part of foundations, the full strength of ZFC set theory is deployed to give precise definitions for languages, theories and models. But there is no circularity here as model theory is not required to define ZFC or first order logic.
In an introductory text on formal logic or set theory, where the purpose is just to introduce these concepts precisely to an audience which is not familiar with them, you might take as given certain basic terms whose meaning is understood informally. For example "set" might just mean what it usually means informally: some collection. Here the point is not to avoid circularity; it is just to help the reader understand things. In such a book we should expect that fine and subtle properties of sets are not used, as this would require formal study of sets themselves to be rigorous.
But first order logic itself when considered with finite languages and finite or recursively enumerable sets of axioms is not circular. One may need notions of set or list but here we are only talking about finite sets and lists and we do not need to rely on a sophisticated theory of potentially infinite lists.
As a piece of evidence that formal logic is not circular, consider that theorem provers for first-order logic actually exist in the real world and you can use them to derive theorems. Therefore the definitions of first-order logic that are given in a textbook are sufficiently clear and unambiguous that they can be used as a reference to actually implement these things on a computer. We have programs that can check whether proofs are correct.
I remember first learning sequent calculus and being amused and somewhat appalled at how there are three levels of 'implication' at work: we can write formulae like $\varphi \to \psi$,
and also sequents like $\varphi \vdash \psi$,
and also inferences, or maybe they're called deductions, like $\dfrac{\Gamma \vdash \varphi \quad \Delta \vdash \psi}{\Gamma, \Delta \vdash \varphi \wedge \psi}$.
Btw, @Julius Hamilton, I like how you're now indicating when you're not sure of something. This makes it easier for people to jump in and explain things.
Can anyone explain why three levels of implication are used? If multiple levels are needed then I could understand having infinitely many, but exactly three has always seemed odd to me.
Lewis Carroll's "What the Tortoise Said to Achilles" is about roughly this question
https://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achilles
Oh interesting. I'm very familiar with that story, so I will go and meditate on how it relates to the three levels.
I too wonder whether and why 3 levels is optimal, no more and no less. Of course if your "object language" were insufficiently expressive to talk about implication you'd strip off the bottom level. And if you used Hilbert-style deduction instead of natural deduction you could strip off the middle one. You need one level just to express the idea of deduction rules.
$\to$ can be seen as an internal hom, $\vdash$ indicates the domain and codomain of a morphism, and the horizontal line corresponds to the use of a universal property, a functor, or post/precomposition by something.
In fact there is also somehow a fourth level which is very important in categorical semantics: the one of rewriting / cut elimination. You give rules of the form $\pi \rightsquigarrow \pi'$ where $\pi$ and $\pi'$ are two proofs with the same conclusion. The goal is to eliminate the uses of the cut rule. It also introduces the computational content of proof theory, and the coherence diagrams in categorical semantics.
Good point!
3 levels is merely a design choice! There are many "Substructural Logics" available, that e.g. disallow rules such as "the order of assumptions doesn't matter" https://en.wikipedia.org/wiki/Substructural_logic . Informally it seems like these logics can have both fewer or more 'levels'. Then the fun begins in trying to find the corresponding categorical semantics, as Jean-Baptiste was alluding to.
Back here Patrick wrote
The horizontal edges can be thought of as edges which join nodes in the tree.
I'd put it differently. The trees here are finite rooted planar trees, with edges oriented toward the root. The edges of the rooted deduction tree are labeled by sequents, and the non-root nodes of the deduction tree are labeled either by an axiom rule (at the top of the tree, i.e., if the node is a leaf) or a rule of inference (all other non-root nodes). So the nodes correspond to those horizontal bars, where the premises of the rule application are the sequents appearing above the bar; these sequents label the incoming edges of the node. The conclusion of the rule application is the sequent appearing below the bar, and this sequent labels the outgoing edge of a labeled node.
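If it helps to make that concrete, here is a rough sketch (Python, my own ad hoc encoding, nothing standard) of such a deduction tree, with nodes labeled by rule names and outgoing edges labeled by conclusion sequents:
```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Deduction:
    rule: str               # rule applied at this bar ("Assm", "PC", ...)
    conclusion: str         # the sequent below the bar (labels the outgoing edge)
    premises: List["Deduction"] = field(default_factory=list)  # subtrees above the bar

    def leaves(self):
        """Leaves are the nodes with no premises: axiom/assumption rule applications."""
        if not self.premises:
            return [self]
        return [leaf for p in self.premises for leaf in p.leaves()]

# A ⊢ A ∧ A, built from two uses of the assumption rule and one ∧-succedent step:
proof = Deduction("∧S", "A ⊢ A ∧ A",
                  [Deduction("Assm", "A ⊢ A"), Deduction("Assm", "A ⊢ A")])
print([leaf.rule for leaf in proof.leaves()])  # ['Assm', 'Assm']
```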
Jean-Baptiste's comment is pointing to some very important stuff. At the very beginning is the distinction between the "arrow" $\to$ (often called "material implication") and $\vdash$ (often denoted also by $\Rightarrow$, and which I often call an entailment symbol). When I teach logic to students coming at it for the first time, I try to explain the distinction between implication and entailment. If $p$ and $q$ are propositional expressions, then $p \to q$ is another propositional expression. Here I'm using the word "expression" in the same way that $x + y$ is an algebraic expression. The expression $x + y$ isn't asserting anything; it's just a type of noun. Likewise $p \to q$ by itself isn't supposed to be asserting anything; it's another sort of noun.
On the other hand, when I write $p \vdash q$, I pronounce this as "$p$ entails $q$", and this time I'm asserting something, I'm stating a whole sentence, that a certain relation holds between $p$ and $q$, basically saying that if $p$ is true, then $q$ is also true. It's easy to get confused by these things. I say that $p \to q$ is a mathematical noun. But if I say "$p \to q$ is true" (i.e., the expression $p \to q$ is a tautology), well now that's an assertion, a sentence, and it carries the same meaning as $p \vdash q$. So in other words, "$p$ entails $q$" says the same thing as "the implication $p \to q$ is true". It doesn't help that many or most mathematicians skip over these distinctions in their everyday talk, or use "implies" and "entails" interchangeably, but when we're slowing down and examining the use of mathematical language and logic carefully, it helps to maintain such distinctions.
I haven't gotten to the deeper levels of J.-B.'s comment, because it alludes in fact to decades of research, and it deserves a long (and mathematically loving) discussion. Some time.
I would say that the trio of levels of implication derives from the perhaps more fundamental distinction between meta-language, object-language and judgements.
In our object-language we have propositions. Judgements say what propositions entail other propositions. To talk about what judgements we can derive from other judgements we use implication in the meta-language. And finally, object-language implication internalises judgemental entailment, so now a proposition can express an implication and we can make judgements involving such propositions.
But maybe this is another way of saying what's been said above. Indeed the category-theoretic viewpoint that @Jean-Baptiste Vienney talked about corresponds to this. Morphisms in your category are judgments, internal homs are implication propositions (objects in the category corresponding to propositions) and meta-implication is determined by whatever meta-theory you are reasoning about your category in.
John Baez said:
Btw, Julius Hamilton, I like how you're now indicating when you're not sure of something. This makes it easier for people to jump in and explain things.
Thanks that was really good advice
In the signature of set theory I think we only need one symbol, the relation symbol '$\in$'.
I think that when we want to define functions like $\cup$, we have to add a new symbol to the signature. I think this is called a "conservative extension", because it produces a logically equivalent theory. I forget the formal definition of "logically equivalent" while writing this, but it might be "having the same models".
But in order to keep to the formalism of FOL, I think we have to add these symbols into the signature, “from the beginning”. It seems like FOL doesn’t actually have a “rule” that lets you “define a new function later on”. It seems like all functions are understood to be stated in the signature at the beginning. It seems like a “definition” is equivalent to stating equality between two terms, but where one of the terms is “atomic” (I think). By atomic, I mean it is either a function symbol or constant symbol, and not a term formed inductively from the other term formation rules. Maybe I am mixing this up with formula formation rules.
I thought I was clear on this but now realize I'm not. Let's say you want to define $A \cup B$ as $\{x : x \in A \vee x \in B\}$. We haven't defined set-builder notation yet. I was just thinking of ways to try to define it, but I think they were circular. I wanted to say "$A \cup B$ means the set of all $x$ such that $x \in A \vee x \in B$ is true". I think the ability to do this is what the axiom of specification allows. I need to study this now, but I'm going to guess that, regarding the sequent calculus, the set theory axioms which tell you what sets exist are written with the '$\exists$' quantifier, but I'm not sure.
You don't need to introduce a symbol $\cup$ into the language of set theory, because introducing it and adding a new axiom that says $x$ is in the union of $A$ and $B$ iff $x$ is in $A$ or $x$ is in $B$ doesn't let you say or prove anything really new.
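(In symbols, the defining axiom I have in mind is something like this:)
```latex
\forall A\, \forall B\, \forall x\, \bigl( x \in A \cup B \;\leftrightarrow\; (x \in A \,\vee\, x \in B) \bigr)
```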
First, this extension is conservative. This means that anything expressible in the original theory and provable in the new larger theory was already provable in the original theory.
But second, I think you can prove that anything expressible in the new larger theory is equivalent, in the larger theory, to something in the original theory. I don't know the technical term for this property.
Combined, these things say you "can't say or prove anything really new".
It seems like FOL doesn’t actually have a “rule” that lets you “define a new function later on”. It seems like all functions are understood to be stated in the signature at the beginning.
We can talk about one theory being an extension of another, and they don't need to have the same signature. I think this is able to handle the issues you're concerned with here.
It seems like a “definition” is equivalent to stating equality between two terms,
I don't think it works that way for the definition of "union" that I just gave. I instead gave an axiom saying what the members of a union are.
Thanks.
I just wrote out some stuff that I haven't shared yet, but I'm trying to formalize the rules of term formation. The idea is to represent each distinct type of symbol as a set. For example, one of these sets is the set of logical connective symbols. The syntax should be given by functions on the different types of symbols. For example, $\neg$ is a unary function from the set of formulas to the set of formulas.
I am thinking the most about how to formalize the idea of the $n$-arity of a function. If a function symbol is 3-ary, a syntactic function has to say "for this function symbol, the selection of any 3 terms is a term". Maybe you could think of this as a map $\mathrm{Terms}^3 \to \mathrm{Terms}$. However, I need a general formula where the arity is a variable, something like $\mathrm{Terms}^n \to \mathrm{Terms}$. Maybe it could be expressed as a function which takes an arity and returns a function of that arity. I need to think this over more, but I'm picturing an inductive definition in the style of type theory. The constants can be function types. The function-generator could be something like a map from arities to term-formers. I think I'm close. I think I need to define the set of function symbols as a product type of natural numbers and symbols (so symbols get tagged with arities). Then I need the function_term generator to "grab" the arity and pass it forward, perhaps with indexing, so there is a way of accessing the arity.
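Here is a small sketch of the "tag each function symbol with its arity" idea (Python, my own toy encoding; the symbols and arities are invented for illustration):
```python
# Function symbols tagged with their arities, as (name, arity) pairs.
FUNCTIONS = {("plus", 2), ("succ", 1), ("zero", 0)}
VARIABLES = {"x", "y", "z"}

def make_term(symbol, args=()):
    """Build a term as a nested tuple, checking arity; variables are leaves."""
    if symbol in VARIABLES:
        if args:
            raise ValueError("variables take no arguments")
        return symbol
    arities = dict(FUNCTIONS)
    if symbol not in arities:
        raise ValueError(f"unknown symbol {symbol!r}")
    if len(args) != arities[symbol]:
        raise ValueError(f"{symbol} expects {arities[symbol]} arguments, got {len(args)}")
    return (symbol, *args)

t = make_term("plus", (make_term("x"), make_term("succ", (make_term("zero"),))))
print(t)  # ('plus', 'x', ('succ', ('zero',)))
```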
You might want to look at this Wikipedia article, and at the textbooks given in the references.
I am thinking the most about how to formalize the idea of the $n$-arity of a function.
It sounds like you're worrying over things that lie outside a given theory. I think of the signature and axioms of a theory as more or less a bare-bones set of instructions to the user who is going to be proving results in the theory, about how to form terms and formulas and so on, and these actions of course are not formal items within the theory; they are at a "meta" level. So when you state these concerns, it sounds like you are asking about how to formalize the general notion of (let's say for now first-order) theory, within a metalanguage (usually a weak set theory of some sort).
So for example, a signature consists of a "set" $S$ of formal sort symbols, a "set" $F$ of formal function symbols together with a "function" $\mathrm{ar}_F : F \to S^* \times S$ (where $S^*$ is the "set" of finite words using $S$ as alphabet), and a "set" $R$ of relation symbols together with a "function" $\mathrm{ar}_R : R \to S^*$. The $\mathrm{ar}_F$ and $\mathrm{ar}_R$ are arity functions. The words I put in quotation marks are linguistic items in the metalanguage.
Now if you want to know how to formalize, for example, the "set" of all terms generated by a signature within the metalanguage, that can be done. My own preference would be to do this using trees, following a variable-free approach as in the book by Tarski and Givant. But anyway, maybe have a look at Introduction to Metamathematics by Kleene (I hope that's a good reference).
I get a sense that maybe you're worried about an infinite regress of some kind; for example, wouldn't we need a meta-meta-language to formalize the operations used in the meta-language? I tried to summarize my own feelings on the matter in section 2 of this (still incomplete!) document. The most helpful thing for me is to consider the fact that when we encode the items of a theory into computer software, say, we never worry about encoding an infinite regress of metas: we just do the encoding and let the machine perform logical operations correctly by clever arrangement of logic gates, and let it churn out correct proofs. It doesn't need to house the entire infinite collection of correct deductions. And neither do human users. We just need to learn how to build correct proofs with our fingers and pencils and neurologies, which somehow or other we learn to do (I guess! for the most part anyway!).
(Of course, nothing stops us from contemplating the infinite set of correct deductions! But we don't have to do that to get on with the business of proving theorems as ordinary mathematicians.)
Thanks.
What is the function $\mathrm{ar}_F$ for?
I added that in. It's the arity function. If $\mathrm{ar}_F(f) = (s, t)$, that tells you that in an interpretation of the theory, one will interpret the sort symbols $s$ and $t$ as sets $S$ and $T$, and the function symbol $f$ as an actual function $f : S \to T$, from the set $S$ to the set $T$.
That’s cool, I’ll think about that and get back to you.
Often you can think of the horizontal line as a natural or dinatural transformation between profunctors. It is dinatural if a symbol appears both on the left and on the right of the turnstile, natural otherwise.
For example,
denotes a natural transformation
where
and
And for example,
denotes a dinatural transformation
where
and
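To give standard instances of each kind (my choice of examples, reading sequents as hom-sets in a category of proofs):
```latex
% Natural: the conjunction (succedent) rule
\frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \wedge B}
\quad\text{corresponds to}\quad
\mathrm{Hom}(\Gamma, A) \times \mathrm{Hom}(\Gamma, B) \;\Rightarrow\; \mathrm{Hom}(\Gamma, A \wedge B),
\quad\text{natural in } \Gamma, A, B.

% Dinatural: the cut rule, where A appears on both sides of the turnstile
\frac{\Gamma \vdash A \qquad A \vdash B}{\Gamma \vdash B}
\quad\text{corresponds to}\quad
\mathrm{Hom}(\Gamma, A) \times \mathrm{Hom}(A, B) \;\Rightarrow\; \mathrm{Hom}(\Gamma, B),
\quad\text{dinatural in } A.
```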
Todd Trimble said:
I get a sense that maybe you're worried about an infinite regress of some kind; for example, wouldn't we need a meta-meta-language to formalize the operations used in the meta-language?
Yes, @Julius Hamilton is always grappling with this, which is amusing because it reminds me of myself when I was first learning logic. I wanted everything to be rigorous, so I would keep trying to take informal ideas in the metalanguage - the language the books were using to talk about logic - and make them be part of the language itself, so they'd be formal too.
This is a doomed enterprise. But if you're stubborn, there's probably no one particular point at which you're forced to quit: you can probably always keep taking whatever informal concepts you're using to talk about the formalized concepts, and try to formalize those too! The doomed nature of the enterprise only reveals itself slowly, as you notice your work is never done.
The doomed nature of the enterprise only reveals itself slowly, as you notice your work is never done.
Or as you notice that the work you're supposed to be doing is not getting done. ;-)
I was like this also when I started learning logic. Then, I became more interested in foundations (or maybe we should not call them foundations but just mathematical languages) where the language makes apparent as straightforwardly as possible some combinatorial/algebraic content. This is not just because of me but also because this is where the research was done in my city. And I was for sure influenced by a guy named Girard who basically invented what was studied here and was virulently defending the idea that logic must liberate itself from the "bureaucracy" of heavy formalism. I completely get it that you want to try being ultra-rigorous at first.
I read that you are/were interested in linguistics. Maybe you would like to take a look at how Wittgenstein changed his mind over his life. First, he wrote the Tractatus Logico-Philosophicus, where he explained how he thought that understanding the world is the same as understanding a single formal logical system. Then he changed his mind, and in the later books, like the Blue and Brown Books or the Philosophical Investigations, he thought it was more important to study human language in a very pragmatic way. Also interesting is that he became interested in local "language games" which interpenetrate each other, rather than a single global language. All in all, he evolved in pretty much the same way as logic in mathematics/computer science did.
I just say this maybe for later. But for now, have fun in your exploration of perfect formalization and meta-languages :).
(Also, it is very important to understand the classical sequent calculus first if you want to understand what was invented in logic afterwards, starting from it!)
When I was trying to make everything maximally rigorous I wanted to see how much I could reduce the assumptions I was using to do mathematics. It eventually became clear that repeatedly trying to take the metalanguage and move it into the formalism being discussed does not always help! And by reading the masters of logic it became clear that a good set of minimal assumptions is that you are able to manipulate finite strings of symbols from a finite alphabet using purely syntactic rules.
Of course, we need to understand what "purely syntactic rules" means. There's more to say about this than I'm able to easily say, but certainly the deduction rules of the sequent calculus for first-order logic count.
So, it doesn't pay to formalize the metalanguage in which you specify the rules of this sequent calculus if what you're seeking is "maximum rigor". That metalanguage will demand its own metalanguage, which will have no fewer ontological commitments, ambiguities, etc.
When you're trying to dig to the very bottom of things in this way, and struggling to understand "purely syntactic rules", it can pay to read Wittgenstein's work on "following a rule" in his Philosophical Investigations. It's mind-blowing and for some infuriating, so I'm glad to see @Jean-Baptiste Vienney recommending it too!
The basic question here is whether we can ever know for sure what it means to "follow a rule", say a rule like "take a word and put a letter s at the end of it".
I studied logic with Kripke as an undergrad, and only read Philosophical Investigations as a grad student under the influence of my best friend, Chip (now Chelsea) Wendt, who was reading it for a class. So I wasn't in a position to notice that Kripke wrote a book on Wittgenstein's work on following a rule, called Wittgenstein on Rules and Private Language. He also finished it right when I graduated, in 1982.
I still haven't read Kripke's book! According to Wikipedia,
Wittgenstein on Rules and Private Language is a 1982 book by philosopher of language Saul Kripke in which he contends that the central argument of Ludwig Wittgenstein's Philosophical Investigations centers on a skeptical rule-following paradox that undermines the possibility of our ever following rules in our use of language. Kripke writes that this paradox is "the most radical and original skeptical problem that philosophy has seen to date" (p. 60). He argues that Wittgenstein does not reject the argument that leads to the rule-following paradox, but accepts it and offers a "skeptical solution" to alleviate the paradox's destructive effects.
Wow lucky.
Kripke invented modal logic didn’t he?
No, that couldn't be true, modal logic is old.
He did important work on modal logic though.
John Baez said:
I studied logic with Kripke as an undergrad, and only read Philosophical Investigations as a grad student under the influence of my best friend, Chip (now Chelsea) Wendt, who was reading it for a class. So I wasn't in a position to notice that Kripke wrote a book on Wittgenstein's work on following a rule, called Wittgenstein on Rules and Private Language. He also finished it right when I graduated, in 1982.
I still haven't read Kripke's book! According to Wikipedia,
Wittgenstein on Rules and Private Language is a 1982 book by philosopher of language Saul Kripke in which he contends that the central argument of Ludwig Wittgenstein's Philosophical Investigations centers on a skeptical rule-following paradox that undermines the possibility of our ever following rules in our use of language. Kripke writes that this paradox is "the most radical and original skeptical problem that philosophy has seen to date" (p. 60). He argues that Wittgenstein does not reject the argument that leads to the rule-following paradox, but accepts it and offers a "skeptical solution" to alleviate the paradox's destructive effects.
Woh, I want to read this book now!
The basic question here is whether we can ever know for sure what it means to "follow a rule", say a rule like "take a word and put a letter s at the end of it".
This is indeed a fun and philosophically tricky question to think about. But I am curious about how "it can pay"
When you're trying to dig to the very bottom of things in this way, and struggling to understand "purely syntactic rules", it can pay to read Wittgenstein's work on "following a rule" in his Philosophical Investigations.
in order to understand syntactic rules. Do you contend that reading this will help our friend Julius understand logic and proofs better?
I also wonder (but a subject for a different thread): how do you think Rota would respond to this particular investigation?
Todd Trimble said:
The basic question here is whether we can ever know for sure what it means to "follow a rule", say a rule like "take a word and put a letter s at the end of it".
This is indeed a fun and philosophically tricky question to think about. But I am curious about how "it can pay".
Maybe only in the way that sometimes you have to sink to the very bottom of doubt before it sinks in that no matter how much you doubt, the world doesn't end - indeed it marches on quite unconcerned - so you might as well relax a little.
But my point was that:
1) Julius has been worrying about questions like "when they say an alphabet is a set of symbols, does that mean I need to formalize set theory before I can do first-order logic?" No: to get started, we need very little. For example it suffices to know how to follow purely syntactic rules for producing new strings of symbols from old ones.
2) If at this point one is still worried, and wondering if we can ever be sure we are following such a rule, then maybe it's time to read what Wittgenstein and Kripke said about this - because they thought about this very hard already.
Point 1) is probably about 100 times more important for Julius right now than point 2).
Do you contend that reading this will help our friend Julius understand logic and proofs better?
Maybe not. It helped me understand logic and proofs better, but only after I aced courses on logic with Benacerraf and Kripke, so my problems were of a more philosophical than technical nature.
I can also vouch that reading Philosophical Investigations helped me understand logic and proofs better.
some comic relief about Wittgenstein's 'pivot': https://existentialcomics.com/comic/551
I agree with everything you just said, John. And there may be people out there who constitutionally have to go through these struggles and doubts, and for those people, working through Wittgenstein and Kripke et al. might be invaluable. But as a pragmatic matter, people learn how to recognize correct proofs and eventually learn how to write proofs themselves in largely the same way the world over, because (I think) the ability to do this is part of our natural human genetic endowment, and the sorts of thought experiments (mentioned in the Wikipedia article on Kripke's book) adduced to express some radical doubts about rule following are, well, fabricated, don't occur in practice, as I suppose everyone agrees.
Which is a long-winded way of agreeing: yes, try relaxing a little, you'll get the hang of it anyway if you keep working at it, and the craving for absolute philosophic certainty will likely abate, once you feel in your bones what mathematical certainty is like.
(I am reminded of this wonderful essay on MathOverflow by Andrej Bauer.)
the sorts of thought experiments (mentioned in the Wikipedia article on Kripke's book) adduced to express some radical doubts about rule following are, well, fabricated, don't occur in practice, as I suppose everyone agrees.
Wittgenstein also agreed with this, and I think his career shows someone slowly and rather painfully breaking away from the goal of absolute philosophic certainty and complete clarity. So for people hung up on certainty, it can be useful to read him. Of course, another approach is to never get bitten by the certainty bug in the first place!
The rise of computers puts another twist in the story, because now we tend to use them as the paragon of certainty, and we have a lot of mathematicians eager to achieve greater certainty by formalizing mathematics on computers. This doesn't excite me, so I just do my own thing.
Getting back to Julius's opening post:
My first question is about whether there is an interesting structural relationship between formula-formation rules and sequent calculus derivation rules.
I would say yes. This is pretty well masked in the fragment of the text that you reproduced, but in the presentations of sequent calculi that I've used, for each kind of primitive formula-building operation, like $\wedge$ or $\vee$, there will be two rules: one for how the operation gets introduced in the succedent (like $\wedge$S), and one for how the operation gets introduced among the antecedents (like $\wedge$A). From a more advanced perspective, the two rules together reflect an appropriate universal property satisfied by the built formula, when you interpret these formulas categorically. But I can't tell how useful it would be to embark on this point right now; it might be a case of putting the cart before the horse. So maybe just put a pin through it for now.
One thing that annoys me in the chart that you copied is that this "two rules principle" isn't well manifested in the treatment of negation $\neg$. In fact it's hard to tell from that chart how a formula $\neg\varphi$ would ever get introduced in the first place. (I would rewrite Ctr so as to swap $\varphi$ and $\neg\varphi$, so that the conclusion of the rule looks like $\Gamma \vdash \neg\varphi$, and then at least you see how negated formulas get introduced on the right of the sequent.) It looks a little better in the Wikipedia article, except that they use the goofball convention that the flow of inferences moves up the deduction tree, as opposed to the down direction that everybody else (that I've seen) uses. I mean, the discussion of your book's treatment could devolve into something super-technical. It's likely that the author further explains how negation is handled in notes to their presentation, but frankly I think I've seen it handled better elsewhere.
If you ask me for advice, I would generally advise against trying to learn how to construct proofs by studying a text on sequent calculus. In my opinion, nobody but a logical technician with a technical purpose in mind would or should present ordinary proofs this way. It's neat to see how the process works, and phenomena like cut elimination are quite profound and indicate the real purpose of sequent calculus, but otherwise it's a somewhat cumbersome and unwieldy instrument.
On the other hand, how formulas are constructed -- which is how we started off the discussion -- can offer vital clues about how to go about proving they are true. For example, if you see a formula like $\forall x \in D\ (\varphi(x) \Rightarrow \psi(x))$, then a proof would typically be formatted like so: let $x$ be a particular but arbitrary element of the domain $D$. Suppose $\varphi(x)$ is true. Then [some argument ending in] $\psi(x)$ is true. For a mathematician, the proof would stop pretty much there, but for the relative beginner who needs to hear what just happened: the block of the proof that starts with "suppose $\varphi(x)$ is true" and ends with "$\psi(x)$ is true" sits inside a proof box (a kind of hypothetical world in which one supposes $\varphi(x)$ is true) which when complete gives the proof of the material implication $\varphi(x) \Rightarrow \psi(x)$. Then, since $x$ was an arbitrary element, the block of the proof that starts with "let $x$ be a particular but arbitrary element", and ends with "$\varphi(x) \Rightarrow \psi(x)$" (that let clause also opens a kind of hypothetical world, another proof box which closes with the goal of establishing that implication), well that block establishes the formula $\forall x \in D\ (\varphi(x) \Rightarrow \psi(x))$, where $x$ is a variable.
I sincerely hope none of this sounds at all condescending. There's actually a fair bit going on here. In particular, calling a letter $x$, which doesn't vary in any discernible way, a "variable", is the type of thing that can be very confusing. The maneuver from that variable to a "particular but arbitrary element belonging to the stated domain $D$" is crucial to appreciate in detail. For example: by homing in on "particular", we are allowed to speak of a specific truth value for $\varphi(x)$, and this specificity lands us in good old propositional calculus. A good solid text aimed at undergraduates taking a proofs course will explain all such nuances in detail; I think we had a discussion here at this Zulip on what are some good textbooks here.
It looks a little better in the Wikipedia article, except that they use the goofball convention that the flow of inferences moves up the deduction tree
Oh, I see, I was looking at the Reduction Trees section, whereas the section "The system LK" is where the sequent calculus proper is first described. That makes better sense.
A good solid text aimed at undergraduates taking a proofs course will explain all such nuances in detail; I think we had a discussion here at this Zulip on what are some good textbooks here.
I think Mike Shulman said that he used Velleman's How To Prove It for his course and thought it was pretty good. Youtuber MathSorceror recommends Proofs: A Long-Form Mathematics Textbook by Jay Cummings.
I'll admit it took me a loooong time before I became competent at writing mathematics. (Proofs courses weren't much of a thing in my day.) I remember signing up for a graduate real analysis course well before I was ready, and I had no idea just how deficient my skills were until my first homework submissions came back. And I didn't want to admit how bad I was.
Thanks.
I can respond in detail soon.
I made a YouTube video of me reading a paper about the general semantics of type theories by Taichi Uemura. I'll share it when it finally finishes uploading…
This might sound very random, but there is a hilarious thing on the internet right now where an actor called Terence Howard is discussing some mathematical and scientific ideas with a mathematical physicist called Eric Weinstein.
I personally think it’s actually kind of interesting to think about why he thinks what he does.
He is very intent on the idea that “1 x 1 cannot equal 1, it must equal 2”.
I was thinking about if we could provide any algebraic structure that would actually have some of the properties he wants.
He says he thinks 2 x 2 = 4, as well.
It sort of seems like he wants to point out the sequence $2, 4, 8, 16, \ldots$, i.e. powers of two.
If we defined “” to mean , we would say , , , for example. But he wouldn’t accept .
Just hypothetically. I wonder if there is an algebraic structure where first of all there is no identity element. (Semigroups, Rngs). (Howard seems to be averse to identity elements, he doesn’t like 1 as multiplicative identity, and he says he doesn’t believe 0 exists).
Second of all let’s say it’s countably infinite since presumably his envisioned number system is similar to integers.
At this point for me it becomes a genuine math exercise.
I think the integers under multiplication form a commutative monoid.
I think we could prove this by defining natural numbers either with a simple set-theoretic successor function where $s(x) = \{x\}$, which I think gives the "Zermelo ordinals", or we could use the von Neumann ordinals, which I think are generated with $s(x) = x \cup \{x\}$. Someone told me the Zermelo ordinals are acceptable as a "rank function", but I think the point of the von Neumann ordinals is that if you view the $\in$ relation as a way of 'ordering' sets, Zermelo's function induces, I think, maybe a very weak partial order? I think for the set of elements generated by Zermelo's function, $\in$ is not a transitive relation. For example, $\emptyset \in \{\emptyset\}$, and $\{\emptyset\} \in \{\{\emptyset\}\}$, but $\emptyset \notin \{\{\emptyset\}\}$. So I wonder if you require von Neumann's definition to define addition.
I wonder if you can figure out a good definition for addition by considering what properties it should have. Maybe category theory could help us there. I wonder if the universal property of the coproduct tells us what we want addition on sets to be like.
I can explore this intuitively. We want a function on sets. You take two sets and it returns a new set. The two inputs should always be less than the output, unless one of the inputs is zero. I wonder, what if this condition alone is enough to define the structure of addition, or what if we need more conditions?
I think if I can prove the set of von Neumann ordinals exists, I can define a binary function $+$ on that set.
I think we make a 'conservative extension by definition'. We add $+$ to the set of function symbols. I think we'll assume we also have an empty-set symbol $\emptyset$ in our set of constant symbols. We add axioms:
$x + \emptyset = x$, and $x + S(y) = S(x + y)$, where $S(y) = y \cup \{y\}$ is the successor.
I would like to explore the properties of $+$ under just this definition later.
Addition is commutative: $x + y = y + x$.
Addition is associative: $(x + y) + z = x + (y + z)$.
I will try to write this out fully later, but I’m asking myself about the necessary and sufficient axioms to determine addition as we know it.
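In the meantime, here is a quick sketch (Python, my own informal encoding, not a proof of anything) of the von Neumann construction and of addition defined by the recursion above:
```python
# Von Neumann naturals as frozensets: 0 = {}, S(x) = x ∪ {x}.
ZERO = frozenset()

def succ(x):
    return x | frozenset({x})

def to_int(x):
    return len(x)  # for a von Neumann natural, its cardinality is its value

def add(x, y):
    """Addition by recursion on y: x + 0 = x, x + S(y) = S(x + y)."""
    if y == ZERO:
        return x
    # Predecessor of a nonzero von Neumann natural: its largest element.
    pred = max(y, key=len)
    return succ(add(x, pred))

two = succ(succ(ZERO))
three = succ(two)
print(to_int(add(two, three)))  # 5
```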
Then the idea is an exploration of more algebraic structures. “Howard’s operator” I think is monotonic, it has no identity element, it possibly has a single generator… would be cool to identify such an operator or prove it would be self-inconsistent.
So I wonder if you require Von Neumann’s definition to define addition.
No.
However, there is such a thing as a [[natural numbers object]] in the sense of Lawvere, which has a similar flavor in that it's a universal object with a "successor" operation and a starting point (which you can call whatever you like; most people call it "zero", but if you're allergic to zero, call it something else then). And you can define addition and prove all the usual properties, based on the universal property. (I'm telling a little white lie which the cognoscenti will recognize immediately, but let me not worry about it.)
Relatedly, there's something called Peano arithmetic. Lawvere's definition is closer to Peano's original idea, which is a second-order logic idea. Most people, referring to Peano arithmetic, study the first-order version (which has addition and multiplication operations baked in from the outset), but I actually think it's worthwhile playing with the second-order definition first. The Wikipedia article discusses all these things.
John Baez said:
I studied logic with Kripke as an undergrad, and only read Philosophical Investigations as a grad student under the influence of my best friend, Chip (now Chelsea) Wendt, who was reading it for a class. So I wasn't in a position to notice that Kripke wrote a book on Wittgenstein's work on following a rule, called Wittgenstein on Rules and Private Language. He also finished it right when I graduated, in 1982.
I still haven't read Kripke's book! According to Wikipedia,
Wittgenstein on Rules and Private Language is a 1982 book by philosopher of language Saul Kripke in which he contends that the central argument of Ludwig Wittgenstein's Philosophical Investigations centers on a skeptical rule-following paradox that undermines the possibility of our ever following rules in our use of language. Kripke writes that this paradox is "the most radical and original skeptical problem that philosophy has seen to date" (p. 60). He argues that Wittgenstein does not reject the argument that leads to the rule-following paradox, but accepts it and offers a "skeptical solution" to alleviate the paradox's destructive effects.
on a closer look, wittgenstein's radical skepticism was what constructivists would call ultrafinitism. the problem of incomplete inductive definitions was solved by goedel. e.g., the solution of the "paradox" of addition is that, although i have only ever performed a finite number of additions, i can write down a definition of addition that will completely determine the sum of any pair of numbers. although i can only write down finite specifications, i am able to specify infinite operations. eg the recursion schema is a finite specification of infinite functions. after almost 90 years of programming, we know that finite specifications of infinite operations are not such a big deal as wittgenstein's paradox makes them.
alan turing sat in wittgenstein's seminar and explained that to wittgenstein. wittgenstein was also claiming that the whole hilbertian imperative of sound and complete theories was a fiction, since unsound and inconsistent theories are not a problem if you don't stumble on the inconsistency. turing was pointing out that people sort of care to avoid falling off the cliff down the road and have developed methods to avoid it... wittgenstein is interesting to read, but he swung from trying to do logic in natural language, to being frustrated that that didn't work and then rejecting all logic. which also doesn't really work, although it writes fun prose.
(jeez, this genius G keeps correcting "turing" to "turning". <--- the first instance in quotes was the name of a logician, but it got corrected. i am so appreciative, G.)
I happen to have bumped into a new paper on the arXiv, on the sequent calculus:
The sequent calculus is a proof system which was designed as a more symmetric alternative to natural deduction. The $\lambda\mu\tilde{\mu}$-calculus is a term assignment system for the sequent calculus and a great foundation for compiler intermediate languages due to its first-class representation of evaluation contexts. Unfortunately, only experts of the sequent calculus can appreciate its beauty. To remedy this, we present the first introduction to the $\lambda\mu\tilde{\mu}$-calculus which is not directed at type theorists or logicians but at compiler hackers and programming-language enthusiasts. We do this by writing a compiler from a small but interesting surface language to the $\lambda\mu\tilde{\mu}$-calculus as a compiler intermediate language.
Emphasis mine. I don't agree but I found it funny.
:laughing: I think they meant "only experts of the sequent calculus can appreciate the beauty of the $\lambda\mu\tilde{\mu}$-calculus". As opposed to "only experts of the sequent calculus can appreciate the beauty of the sequent calculus". (Or maybe you don't agree with either statement?)
Okay, so maybe they don't believe in using "it" to refer to the last noun used that could make sense when substituting it into "it". In that case I'm not sure I trust them to teach me the $\lambda\mu\tilde{\mu}$-calculus. :smirk:
(I don't know anything about the $\lambda\mu\tilde{\mu}$-calculus so I have no opinions about who can appreciate its beauty.)
I just glanced at the paper, but I did not appreciate its beauty.
Sometimes I see that the turnstiles are indexed by some parameter, e.g. $\Gamma \vdash_{\vec{x}} \varphi$. Why are the turnstiles indexed in sequent calculus, as opposed to putting that information on the left side of the sequent?
These indexes correspond to the context of (first-order, sorted) variables employed in the terms appearing in the formulas to the left and right of the sequent. An alternative presentation could use dual contexts (for instance $\vec{x} \mid \Gamma \vdash \varphi$) instead of the indexed form $\Gamma \vdash_{\vec{x}} \varphi$.
I’m tired and allowing myself to be a little lazy. These are some questions I am working on as I try to have a better grasp on logic and also set theory.
I think that expression will give us the set of all applications of a function to a set of terms; but then we would need to express this as a recursive function, and express that we can re-apply the same function to the set of terms ad infinitum, as we keep adding terms.
After I think this through I want to do the same for formula-formation and for sequent calculus: to write an expression for how to generate all the results of a system of rules, basically.
I think this could be a clue to help me: https://ncatlab.org/nlab/show/initial+algebra+of+an+endofunctor
I’ll try to study this idea tomorrow.
Assuming the plus sign means the disjoint union. Here we have a set-theoretic definition of disjoint union: $A \sqcup B = (A \times \{0\}) \cup (B \times \{1\})$.
The suggested endofunctor in the first picture sends each set $X$ to $1 + X$, the disjoint union of that set with a singleton set. It occurs to me that the disjoint union could be defined via a unary operator which tags each element with the set it came from. I wonder what we would call such an operation. It seems like a function from the elements of a set to the set containing that set. There's more to be thought about here.
An algebra for this endofunctor is a set X (called the carrier of the algebra) together with a function f : (1 + X) → X.
Here, I notice that the functor sends $X$ to $1 + X$, but the structure map of the algebra is defined as $f : (1 + X) \to X$, in the reverse direction.
I think what they do here is create an inductive definition “piecewise”, and then take the disjoint union / coproduct of the functions in the definition.
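Here is a small sketch of that idea (Python, my own informal encoding): an algebra for $X \mapsto 1 + X$ amounts to a chosen element plus a unary function, and the unique map out of the initial algebra (the natural numbers) is just iteration:
```python
# An algebra for the endofunctor F(X) = 1 + X on Set is a set X together with
# a map 1 + X -> X, i.e. a chosen element of X plus a function X -> X.
# The initial such algebra is (N, zero, successor).

def fold(point, step, n):
    """The unique algebra map from (N, 0, succ) to the algebra (X, point, step):
    0 |-> point, n+1 |-> step(fold(n))."""
    result = point
    for _ in range(n):
        result = step(result)
    return result

# Example target algebra: X = strings, point = "", step = append a bar.
print(fold("", lambda s: s + "|", 3))   # '|||'
# Example: interpreting n in the algebra (N, 0, +2) doubles it.
print(fold(0, lambda k: k + 2, 5))      # 10
```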
I feel like a lot of your questions could be answered simply by reading ahead in whatever text you're using. Or even just by consulting Wikipedia. You could start with Well-formed formula, for instance. No, it's not necessary for a wff to have all variables quantified over (those are called bound variables; the ones that aren't bound are called "free"). For example, if your language has a relation symbol $<$ in it, and $x, y$ denote variables, then $x < y$ is a well-formed formula. When you interpret this formula in the structure of real numbers, you get a certain subset of the real plane, namely the set of pairs of real numbers for which the formula is true.
Mathematically, we could say the set of terms Terms is the union of Constants and Variables.
Why would you say that?? If $x, y$ are variables, the term $x + (y + 1)$ is neither a constant nor a variable. You must be intending "union" in some idiosyncratic sense that no one will go along with.
Consider a function symbol f which is n-ary. The number of terms it can form can be expressed combinatorially - such as “m choose n”, where m is the number of terms we have so far, and n is the arity of f.
You're way off. The "m choose n" terminology is for something else: the number of choices of an n-element subset of an m-element set. For example, "n choose 2" = n(n-1)/2.
There are so many things that are off that it gets to be exhausting cataloguing them all, possibly because you're allowing yourself to be "lazy" as you said. One place where you're not off is the relevance of initial algebras of endofunctors. For example, the initial algebra of the endofunctor $X \mapsto 1 + X$ on the category of sets can be construed as the set of all constant terms for a language with a constant symbol $c$ and a unary function symbol $f$:
$c, fc, ffc, \ldots$, aka $\mathbb{N}$.
But I think maybe now is not the time for a discussion of initial algebras, when you have so many basic bread-and-butter issues waiting to be resolved.
By the way, "formula-formation rules" doesn't seem bad to me. "Rules for defining wff's" could be another. I'm not even sure there's a universal according-to-Hoyle terminology here; different authors may use different phrases which pretty much amount to the same thing.
Thank you.
I feel like a lot of your questions could be answered simply by reading ahead in whatever text you're using.
I'm not trying to be argumentative, but my opinion on this is: in principle, I could just read texts and almost never ask questions in forums, but I ask questions to stimulate discussions and help me organize my thoughts, not because there is no other way for me to find the answer to my questions. I know that my first question is very simple and is easily findable in my textbook. But yesterday evening, I was very tired, but I wanted to try to do something mathematical before I went to sleep, and the result is that it caused the conversation we are having, which to me is a positive result.
The trouble though is that you write a big wall of text with a lot of things wrong in it, and it's tiresome to go through the whole list to point out all the errors. Maybe try to be more laser-like in focus with your questions? I don't know.
@Julius Hamilton For point 3, since (at least pre-)terms and (pre-)formulas are recursively generated by a context-free grammar, they are usually presented metamathematically in Backus-Naur form:
n ::= 0 | sn (natural numerals, to index potentially infinitely many variables)
x ::= v_n
c ::= your constants
t ::= x | c | f t t' if f is a binary function symbol (that is, a term endoformer), etc.
φ ::= R t t' t'' | (φ Λ φ') | Q x φ etc., according to whatever your predicate/relation symbols (exoformers of formulae from terms, R being ternary here) and connectives (endoformers of formulae, Λ being binary), including quantifiers (mixed formers of formulae from variables, which they bind, and formulae, Q here being unary in both), happen to be, or whatever else is possible :smile:
(For higher order, add formula quantification, etc.)
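A rough executable rendering of the BNF above, as a sketch: the constructor names, and the choice of exactly one binary function symbol, one ternary relation symbol, one binary connective, and one quantifier, are illustrative assumptions rather than the general case.

```haskell
-- A rough transcription of the grammar above, with one representative
-- symbol of each kind; a real signature would supply its own lists.
newtype Var = V Int                      -- x ::= v_n, indexed by a numeral
  deriving (Show, Eq)

data Term                                -- t ::= x | c | f t t'
  = TVar   Var
  | TConst String                        -- c ::= your constants
  | TFun   String Term Term              -- one binary function symbol
  deriving (Show, Eq)

data Formula                             -- phi ::= R t t' t'' | (phi /\ phi') | Q x phi
  = Rel   String Term Term Term          -- one ternary relation symbol
  | Conj  Formula Formula                -- one binary connective
  | Quant String Var Formula             -- one quantifier, binding a variable
  deriving (Show, Eq)

-- Example: forall v0. R(c, f(v0, v0), v1)
example :: Formula
example = Quant "forall" (V 0)
            (Rel "R" (TConst "c") (TFun "f" (TVar (V 0)) (TVar (V 0))) (TVar (V 1)))
```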
@Todd Trimble
Mathematically, we could say the set of terms Terms is the union of Constants and Variables.
Why would you say that?? If x,y are variables, the term x+(y+1) is neither a constant nor a variable.
In my book, the authors define "terms" in natural language:
I meant to say that the set of terms should be the union of the set of constant symbols, the set of variable symbols, and the set of terms generated by applying functions to terms. I needed to think more about ways to express how applying function symbols generates terms, so I began with a "temporary" definition which accounts for the constants and variables. I meant to imply that this set would then be augmented once I found a good formulation for the function-generated terms. I think this is what your example is saying: $+$ is a function symbol, so you are saying that $x + (y + 1)$ is a term (by rule T3 in the attached photo). My intention was to accommodate such terms after "starting out" with Terms being the union of constants and variables.
@Todd Trimble
Consider a function symbol f which is n-ary. The number of terms it can form can be expressed combinatorially - such as “m choose n”, where m is the number of terms we have so far, and n is the arity of f.
You're way off. The "m choose n" terminology is for something else: the number of choices of an n-element subset of an m-element set. For example, "n choose 2" = n(n-1)/2.
I am, but it's good to be wrong and then be corrected.
Let me see if I can iron out some details here.
This is the definition of function and relation symbols in my book:
I think this could be expressed as a set of ordered pairs, like this:
, where each set is a set of symbols.
I think that when you create terms, there are rules which depend on the numerical value in position 0 of each of the above ordered pairs:
(T3) If the strings $t_1, \ldots, t_n$ are $S$-terms and $f$ is an $n$-ary function symbol in $S$, then $f t_1 \ldots t_n$ is also an $S$-term.
This is where I need to speculate in order to grow. I think it is ok to say things that turn out not to be correct. Thinking exploratively is an important supplement to reading. I agree that my posts should be written as clearly as possible.
I want a clear understanding of a formal or set-theoretic definition of "applying a function to its arguments".
Maybe my familiarity with interpreting an expression like $f(t_1, \ldots, t_n)$ as a function is misleading me. T3 is not relevant to an interpretation of $f(t_1, \ldots, t_n)$. T3 is what I would like to call a "syntactic function". If we think of strings of symbols as tuples, then I think T3 can be written like this:
From this perspective, T3 takes some symbols and returns a tuple of those symbols.
I will continue to explore these thoughts in my next message.
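As a sketch of this "syntactic function" reading of T3, where terms are simply strings, one might write something like the following (applyT3 is a made-up name, not notation from any text):

```haskell
import Data.List (intercalate)

-- A deliberately naive reading of T3: given a function symbol and a list
-- of term-strings, return the new term-string f(t1,...,tn).
applyT3 :: String -> [String] -> String
applyT3 f ts = f ++ "(" ++ intercalate "," ts ++ ")"

main :: IO ()
main = putStrLn (applyT3 "f" ["x", "g(y)", "c"])   -- prints f(x,g(y),c)
```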
I meant to say that...
I thought possibly you meant something like that, but declarative sentences like the one I quoted are just asking to be evaluated as they stand. If you had said Terms include Constants and Variables, then I wouldn't have reacted. :-)
Sometimes when I've taught this stuff, I've used tree-like pictures to indicate the structure of terms, in which term formation proceeds from the top down, toward the root of the tree. At the top of the tree (the "leaves"), the edges adjacent to them are labeled by variable symbols. Further down the tree, each internal node is labeled by a function symbol (like $+$), whose arity must match the number of "incoming edges" (coming from above). By the way, "in my book", constant symbols are just function symbols of arity $0$. Given term-labels $t_1, \ldots, t_n$ of the incoming edges at a node labeled $f$, the rule is that the outgoing edge is labeled $f(t_1, \ldots, t_n)$. The last edge to be labeled by this recursive procedure is the one adjacent to the root, and this is the term that the term-formation tree is trying to describe.
Actually, maybe I shouldn't be saying "trees", unless there are no repetitions of variables. Sometimes repetitions of variables are handled by splitting or bifurcating wires, and these are attached to the tops of the trees. I should probably draw an example by hand and attach a screen shot, but maybe you'll have gotten the gist anyway.
As an exercise to my class, I would have them draw out a term-formation tree in the propositional calculus; it could be for example . And then, to exemplify the splitting of wires, draw the "tree" for .
Anyway, trees are very useful for visualizing the application of recursive rules, and they are closely connected with initial algebras for endofunctors, especially for so-called polynomial endofunctors. The same type of tree mechanism applies not just to term-formation rules, but to sequent-formation rules, as I was trying to describe back here. So even though sequent calculus comes in layers, with term-formation rules being one of the first layers, tree-like figures provide a kind of conceptual unity between the basic structure of formation rules at various levels of describing deductive systems.
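For what it's worth, here is one way to render such term-formation trees as data, with the arity check made explicit; the example signature (c, s, +) is an arbitrary assumption, not taken from any particular book.

```haskell
import Data.List (intercalate)

-- A term-formation tree: each internal node carries a function symbol whose
-- arity must match the number of subtrees; leaves are labeled by variables
-- (constant symbols are nodes with zero subtrees).
data Tree = Leaf String | Node String [Tree]

-- An example signature, given by arities.
arity :: String -> Maybe Int
arity "c" = Just 0
arity "s" = Just 1
arity "+" = Just 2
arity _   = Nothing

-- Check that every node respects its arity.
wellFormed :: Tree -> Bool
wellFormed (Leaf _)    = True
wellFormed (Node f ts) = arity f == Just (length ts) && all wellFormed ts

-- Read off the term that the tree describes (the label of its root edge).
readTerm :: Tree -> String
readTerm (Leaf x)    = x
readTerm (Node f []) = f
readTerm (Node f ts) = f ++ "(" ++ intercalate "," (map readTerm ts) ++ ")"

-- e.g. readTerm (Node "+" [Leaf "x", Node "s" [Leaf "x"]]) == "+(x,s(x))"
```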
Thanks, I really appreciate your help, and I'll study this now, since I have free time today to study.
Julius Hamilton said:
(T3) If the strings $t_1, \ldots, t_n$ are $S$-terms and $f$ is an $n$-ary function symbol in $S$, then $f t_1 \ldots t_n$ is also an $S$-term.
This is where I need to speculate in order to grow. I think it is ok to say things that turn out not to be correct. Thinking exploratively is an important supplement to reading. I agree that my posts should be written as clearly as possible.
I want a clear understanding of a formal or set-theoretic definition of "applying a function to its arguments".
Maybe my familiarity with interpreting an expression like $f(t_1, \ldots, t_n)$ as a function is misleading me. T3 is not relevant to an interpretation of $f(t_1, \ldots, t_n)$. T3 is what I would like to call a "syntactic function". If we think of strings of symbols as tuples, then I think T3 can be written like this:
From this perspective, T3 takes some symbols and returns a tuple of those symbols.
I will continue to explore these thoughts in my next message.
Isn't that what I provided above? :thinking:
Your message looks really great, I’ll respond to it soon.
I want a clear understanding of a formal or set-theoretic definition of "applying a function to its arguments".
Maybe my familiarity with interpreting an expression like f(t1,…,tn) as a function is misleading me. T3 is not relevant to an interpretation of f(t1,…,tn). T3 is what I would like to call a "syntactic function".
So it's true that "in the metalanguage", if $F_n$ is the set of function symbols of arity $n$ in the language at hand, and $T$ is the set of terms of the language at hand, the term-formation rules give functions
$F_n \to [T^n, T]$
where I am using $[X, Y]$ to denote the set of functions from a set $X$ to a set $Y$. This sounds like what you were after, no?
It could also be formulated as a function
$\coprod_{n \geq 0} F_n \times T^n \to T.$
(It's slightly more complicated if there is more than one sort, but many logic texts don't even discuss this possibility, so I won't either for now.)
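A small sketch just to pin down the shapes of these two formulations, with Haskell lists of length $n$ standing in for $T^n$ and strings standing in for function symbols; nothing here is specific to any particular text.

```haskell
-- Terms over an unspecified signature, kept abstract.
data Term = Var String | App String [Term]
  deriving Show

-- Curried form: a function symbol (of some arity n) gives a map T^n -> T;
-- here T^n is approximated by a list of length n.
formWith :: String -> ([Term] -> Term)
formWith f = \ts -> App f ts

-- Uncurried form: a single map out of "the coproduct over n of F_n x T^n",
-- approximated here by a pair of a symbol and an argument list.
form :: (String, [Term]) -> Term
form (f, ts) = App f ts

-- The passage between the two formulations is ordinary (un)currying:
-- formWith f ts == form (f, ts).
```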
I believe you understand exactly what I am after, but I'm going to get there through my own observations and use your messages as reference or support. If it seems like I'm ignoring something you wrote, it's because I'm going to come back to it when I feel I've worked my way up to it.
My text does not use the term "atomic terms", but I think this is conceptually valid. I think of atomic formulae as being "rules" (which I think of as functions) which generate formulae from terms. Like so:
By analogy, I would call "atomic terms" those terms which are not generated by functions from other terms. I think I can call such functions "recursive", "endofunctions", or maybe "operators".
This would make constants and variables atomic terms, and terms "generated" by T3 not atomic. I think non-atomic formulae are called "composite" in my text, so I could consider calling non-atomic terms "composite", unless I don't like how that conceptually portrays them.
It is much easier for me to get out many thoughts I have when they occur to me, even if it seems like a digression. I want to think for a moment about how functions are commonly stated like so:
There is a function $f : A \to B$ such that $f(x) = \ldots$
When a function is defined in what I think is called a "piecewise" or perhaps "arbitrary" way, I think the set-theoretic definition as a total, well-defined relation seems fitting. It seems like we establish the function by stating its values as ordered pairs. I think we can think of this as a "selection" from the two sets $A$ and $B$ (the domain and codomain). By this, I mean that there seems to be a good reason why, even though the image of the function is not equal to the codomain, we still define the function in terms of a codomain set.
I mean to say that I have been thinking about how it sometimes seems odd to differentiate between a function's domain and codomain ($f : A \to B$) and the actual expression that determines its values ($f(x) = \ldots$). It makes me wonder what might happen if, for the same expression, we chose a larger or smaller codomain which still included the image of $f$.
It makes me wonder if we could define functions solely as the result of a "computational process", without "declaring the domain and codomain" (which I think of as analogous to "declaring a type", in type theory).
If I observe that T3 is this function: , and I "unravel" this notation to the set-theoretic definition (I wonder if there is a phrase for the "reduced form" of a definition, where we substitute all definitions with their defining expressions), the tuple is an element of a product.
Maybe that is a reason why we can't avoid choosing a domain and codomain - because a function is a set of ordered pairs, and ordered pairs are defined as elements of a product, and the arguments of the product are precisely the domain and codomain.
I want to think more about this.
I wonder, when I write this function, , do I immediately implicitly determine a domain and codomain? Maybe it sounds obvious, but thinking through this helps me feel like that function implies that it has this domain and codomain: . (One reason why I think this is because I wrote variable symbols such as with an assumption about what values those variables can take, and if I state that , etc., then I think the domain and codomain are determined.)
This leads me to an interesting observation that I think was one of the things that confused me earlier in trying to define this abstractly. It now looks like T3 is just the identity function, because I think we can define the inputs to a function as a tuple as well. I wonder if this is correct or not.
This would make the expression that generates terms via application of function symbols very simple. I think it would be:
, where for and , .
(I am ignoring function arity for now, and using the $n$-fold product notation, to keep this simple.)
Now, I think I need a way to express the set of all elements "generated" by repeated applications of such a function.
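One naive way to "generate" terms by repeated application, as a sketch under an arbitrary example signature: start from the atomic terms and close up to a chosen nesting depth.

```haskell
import Data.List (intercalate, nub)

-- Atomic terms: constants and variables.
atoms :: [String]
atoms = ["c", "x", "y"]

-- An example signature: function symbols with their arities.
sig :: [(String, Int)]
sig = [("s", 1), ("+", 2)]

-- All length-n argument lists drawn from a given stock of terms.
tuples :: Int -> [String] -> [[String]]
tuples 0 _  = [[]]
tuples n ts = [ t : rest | t <- ts, rest <- tuples (n - 1) ts ]

-- One round of term formation: apply every symbol to every argument tuple.
step :: [String] -> [String]
step ts = nub (ts ++ [ f ++ "(" ++ intercalate "," args ++ ")"
                     | (f, n) <- sig, args <- tuples n ts ])

-- Terms of nesting depth at most k (depth 0 = the atoms).
termsUpTo :: Int -> [String]
termsUpTo k = iterate step atoms !! k

main :: IO ()
main = mapM_ putStrLn (termsUpTo 1)
```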
It occurs to me that this could probably be a categorical diagram that looks something like this:
This is very rough, so please excuse the initial speculative nature of these thoughts. But I can imagine that this diagram lets us look at the set of all terms from a different perspective. If I approach defining Terms in a way that I think could be called "constructive", I begin with a set of known terms and use functions to generate more.
In a categorical diagram, it feels like I instead assume the existence of the set of all terms Terms, and I am actually considering that for any subset of Terms, the function (or a variant of it restricted to subsets?) is defined.
The intuition here is the difference between "building something up" and considering something already built up, in a way.
(I also want to consider adding categorical products to such a diagram.)
(Working up to a better or more accurate version of the diagram: termsdiagram2.png)
(It would be cool if there were a way in a diagram to show how Functions and a natural number form a product, which gets mapped to an n-fold product of terms...)
I'm having lots of ideas and questions here, here is another sketch of how a diagram could start out:
I can imagine that a "signature" (at least the function part of it) is a subset of N x Functions. It might have a certain condition like an injective embedding. I'll think more about this.
Sorry if there are mistakes in the following but the following idea excites me.
A relation on $A$ and $B$ is a subset of $A \times B$.
I am pretty sure that in the category of sets, a subset $S \subseteq A$ has an injective map into $A$ defined by $s \mapsto s$ for $s \in S$.
In $\mathbf{Set}$, I think injective maps act as monomorphisms. I think the definition of a monomorphism can be paraphrased as "a morphism which composes uniquely (on the left)". That means that if $m \circ f = m \circ g$, then $f = g$. I want to think more about this, but conceptually, I think of injective functions as "preserving information, so that their inputs are uniquely identifiable".
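Here is a brute-force check of the "composes uniquely on the left" property over small finite sets, just to see the definition in action; the encoding of finite functions as lists of pairs is an ad-hoc choice for the sketch.

```haskell
import Data.List (nub)

-- Functions between small finite sets, encoded as lists of (input, output) pairs.
type FinFun = [(Int, Int)]

apply :: FinFun -> Int -> Int
apply f x = head [ y | (x', y) <- f, x' == x ]

compose :: FinFun -> FinFun -> FinFun      -- compose m f  ~  m . f
compose m f = [ (x, apply m y) | (x, y) <- f ]

allFuns :: [Int] -> [Int] -> [FinFun]      -- every function dom -> cod
allFuns []       _   = [[]]
allFuns (x : xs) cod = [ (x, y) : rest | y <- cod, rest <- allFuns xs cod ]

injective :: FinFun -> Bool
injective m = let ys = map snd m in length ys == length (nub ys)

-- m is mono (tested against maps out of testDom) iff m.f = m.g forces f = g.
isMono :: [Int] -> FinFun -> Bool
isMono testDom m =
  let fs = allFuns testDom (map fst m)
  in and [ f == g | f <- fs, g <- fs, compose m f == compose m g ]

main :: IO ()
main = do
  let m1 = [(0, 10), (1, 11)]   -- injective
      m2 = [(0, 10), (1, 10)]   -- not injective
  print (injective m1, isMono [0, 1] m1)   -- (True, True)
  print (injective m2, isMono [0, 1] m2)   -- (False, False)
```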
This makes me think that whenever there is a monomorphism $m : S \to A$ in $\mathbf{Set}$, the object $S$ is isomorphic to a subset of $A$.
I wonder if this means that a corresponding idea to a subset, in any category, is a monomorphism.
If so, then I think of the collection of all monomorphisms into an object as analogous or isomorphic to the power set of a set. I am still thinking through if this is correct or not.
I think that this construction is isomorphic to its dual, which is the collection of all epimorphisms out of the object. I think this reminds me of seeing somewhere that cardinality can be defined either injectively or surjectively. I can’t remember it right now.
I am thinking about how a Hom-set is the set of all functions between two objects $A$ and $B$ in $\mathbf{Set}$, which I think corresponds to an exponential object $B^A$. So I wonder if we can find interesting correspondences between a classification of the morphisms in $\mathrm{Hom}(A, B)$ into epimorphisms, monomorphisms, both, or neither, and something to do with $B^A$.
I think that a relation on $A$ and $B$ can be seen categorically as a monomorphism into $A \times B$. I wonder how to express a function in a similar way. If a function is left-total (I think it is), I think it should be isomorphic to . If a function is right-unique (I think it is)… I think I was trying to think of a condition relating to , for example, it must be surjective but not injective, I think. But I think a better condition might be that there must exist an injection from the image of in to the preimage of in ???
My idea is to add that the set of function symbols with their arities is a surjective function from $\mathbb{N}$ to disjoint sets of function symbols, or a function (not necessarily injective or surjective) from the set of function symbols to $\mathbb{N}$, and show this in my diagram.
My longterm idea is to find a function/functions which can generate any finite number of terms from an alphabet, then any finite number of well-formed formulas from those terms, and then any finite number of collections of those formulas. Each of those collections represents an axiom system or theory. Then, I want to find a function which generates any finite number of valid deductions in that theory, essentially proving theorems. Then, I want to find a function that evaluates the metalogical properties of each theory, so that we can see which theories have which properties.
This is a conversation I had with an AI; I found it gave me new ideas, even if the AI is incorrect in places.
convowithclaudeaboutsubobjects.pdf
Mike Stay said:
Often you can think of the horizontal line as a natural or dinatural transformation between profunctors. It is dinatural if a symbol appears both on the left and on the right of the turnstile, natural otherwise.
For example,
denotes a natural transformation
where
and
And for example,
denotes a dinatural transformation
where
and
I’d like to understand this now.
Are $A$ and $B$ formulae, which are objects in a category? Is $\Gamma$ a collection of formulae? If so, should it be defined as a product of objects?
denotes a natural transformation
I don’t understand this. A product rule of sequent calculus is being treated as a natural transformation between two functors? Does this mean the sequents are considered functors? In my book, you can have multiple sequents above the line, but I think the order matters. Would that be a product of functors, in your approach? Why does appear to become ?
where and
takes the pullback of , the set of morphisms between and , which is considered a sequent, and , and maps it to ?
If maps sequents to Hom-sets, what is the interpretation of a Hom-set having multiple arrows? Does represent multiple ways the succedent can be a consequence of the antecedent?
Thanks very much.
So many different topics, but regarding the different levels of implication:
John Baez said:
I remember first learning sequent calculus and being amused and somewhat appalled at how there are three levels of 'implication' at work: we can write formulae like $A \to B$
and also sequents like $A \vdash B$
and also inferences, or maybe they're called deductions, like $\frac{A \vdash B}{C \vdash D}$
the first two (I know them as implication and entailment) are united by the deduction theorem: https://planetmath.org/DeductionTheorem
The third one also looks like the same thing.
The pedagogy of using three symbols is valuable --- it helps you understand what's happening --- but at the same time you might find them confusing, as I still do. Are they really exactly the same thing?
They're three different (but related) things!
The symbol $A \to B$ lives inside the logic. It's a statement the logic can or can't prove, like any other. A priori, it doesn't have any "real" mathematical meaning.
The symbol $A \vdash B$ lives outside the logic. It's a statement about "relative truth" that we can only see from the outside. Saying that $A \vdash B$ is saying that "inside the logic, $B$ is at least as true as $A$". A priori it doesn't tell you anything about a "derivation". It still lives inside a proof system, in much the same way that $A \to B$ lives inside the object language.
The symbol $\frac{A \vdash B}{C \vdash D}$ is a statement about a proof system. It says "if you're able to derive $A \vdash B$, then you're able to derive $C \vdash D$".
So we have three "levels" that we work at. The innermost "object language", the middling level of "sequents", which tell us facts about the object language, and the outermost level of "derivations", which tell us how to derive these sequents!
Of course, all these levels are related by these "deduction theorems", which can be confusing.
I'll say more in a minute, I'm getting off a plane
The deduction theorem says that $A \vdash B$ at the middle level holds if and only if $A \to B$ holds in the object language. So "$B$ is at least as true as $A$" if and only if $A \to B$ is true on the nose! So the deduction theorem is really a way to get the object language to know a bit more about itself. Suddenly you can express facts about truth relationships (which naturally live at the level of sequents) in the object language itself! And this self reference is useful.
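One miniature way to see the deduction theorem categorically is the currying isomorphism, reading $A \wedge B$ as a product type and $A \to B$ as a function type; this is only an analogy for intuition, not the theorem itself.

```haskell
-- Reading A /\ B as the product type (a, b) and A -> B as the function type
-- a -> b, the two directions of the deduction theorem correspond to currying
-- and uncurrying: maps out of a product vs. maps into a function type.
deduce :: ((a, b) -> c) -> (a -> (b -> c))
deduce f = \x y -> f (x, y)

undeduce :: (a -> (b -> c)) -> ((a, b) -> c)
undeduce g = \(x, y) -> g x y
```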
You can think of the soundness/completeness theorems as playing a similar role to the deduction theorem, but one level up. It says a relationship $A \vdash B$ is derivable at the top level if and only if this relationship holds in all models!
Just like the deduction theorem lets the object language talk about stuff happening one level higher, we see soundness/completeness lets you talk about derivations at the level of sequents (as long as the sequent holds "uniformly" in the choice of model)
Jencel wrote:
The pedagogy of using three symbols is valuable --- it helps you understand what's happening --- but at the same time you might find them confusing, as I still do. Are they really exactly the same thing?
No, they are completely different things. And they have anti-pedagogical value - as I said, I was completely confused when I first met them. So you need to work to understand how they play different roles. Luckily @Chris Grossack (they/them) has given an intuitive explanation. But don't be fooled by hand-wavy terms like "at least as true as" or "see from the outside": those are just useful heuristics. The hard core is how the 3 different levels of implication are actually used.
As with most math, the best way to start understanding is to get your hands dirty reading papers and doing computations!
Eventually the differences between the levels starts to sink in through experience ^_^
I think it helps to understand how these are interpreted semantically. Say we are using propositional intuitionistic logic. Then a model is a Heyting algebra L and an assignment of propositions to elements of L satisfying the axioms.
A formula is interpreted as an element of the Heyting algebra. If $A \vdash B$, this means the element corresponding to the formula $A$ is less than or equal to the one corresponding to $B$.
You can think of the derivations as saying that if a model satisfies the inequality given by the sequent at the top, it always satisfies the inequality given by the sequent at the bottom. (Here I'm assuming soundness, but if everything is working as intended this will hold.)
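To see this in a tiny concrete case, here is a sketch using the three-element chain $0 \le \tfrac{1}{2} \le 1$ as a Heyting algebra, with entailment interpreted as $\le$ and a brute-force check that the conjunction-introduction rule is sound in this model; the encoding is ad hoc.

```haskell
-- A small Heyting algebra: the chain 0 <= 1/2 <= 1, with the three values
-- encoded as 0, 1, 2.  Meet is min, join is max, and Heyting implication
-- a ==> b is the largest c with min a c <= b.
type H = Int

top, bot :: H
top = 2
bot = 0

meet, join, imp :: H -> H -> H
meet = min
join = max
imp a b = maximum [ c | c <- [bot .. top], meet a c <= b ]

-- A sequent "phi |- psi" holds in this model iff [[phi]] <= [[psi]].
holds :: H -> H -> Bool
holds a b = a <= b

-- Soundness of the rule "from A |- B and A |- C infer A |- B /\ C",
-- checked by brute force over all values of A, B, C.
conjRuleSound :: Bool
conjRuleSound =
  and [ holds a (meet b c)
      | a <- [bot .. top], b <- [bot .. top], c <- [bot .. top]
      , holds a b, holds a c ]

main :: IO ()
main = do
  print (imp top 1)      -- 1: "top implies middle" is just the middle element
  print conjRuleSound    -- True
```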
@John Baez My question was prompted by this sentence you wrote:
And if you used Hilbert-style deduction instead of natural deduction you could strip off the middle one. You need one level just to express the idea of deduction rules.
Now I understand that by "one level" you probably don't mean one level as a whole?
@Chris Grossack (they/them) Thanks for the explanation.
Otherwise, I am familiar with the categorical/algebraic interpretation which @Graham Manuell put forward. In fact, this was what prompted my suspicion that the three levels are the same --- I suspected that they all can be morphisms of some kind, though I know that $A \to B$ is an object in a given Heyting algebra, $A \vdash B$ is a morphism, and the fraction symbol expresses rules for creating objects and morphisms.
P.S. Every time I think about the connection between CT and types I look at this very helpful table that I found in the "Product type" page in nlab
image.png
Unfortunately the paper from where it is taken is on a very different topic and has no more such graphics.
I don't understand why you think they are the same.
I always had the following in my head as a rough explanation for the three levels:
symbol | interpretation |
---|---|
$\vdash$ | morphisms in some category (e.g. a Heyting algebra) |
$\to$ | internal hom in said category |
horizontal line | equations between morphisms in said category |
If that's right then it doesn't seem so mysterious, since we work with those three things all the time in category theory, and in that context they're just three different things - they don't really feel like "levels" of the same thing. (Well the first and third sort of are, but the second is just something else.)
But the graphic above is making me think the third line in the table might be a bit more subtle than that?
@Graham Manuell I don't think they are the same; I asked whether they are, after reading the comment I quoted above.
I would say that the deduction rules are related to some constructions on morphisms. For example, the deduction rule for products
$\dfrac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \times B}$
is exactly interpreted in a cartesian category as the pairing operation, which is a (set-theoretic!) function between hom-sets
$\mathrm{Hom}(\Gamma, A) \times \mathrm{Hom}(\Gamma, B) \longrightarrow \mathrm{Hom}(\Gamma, A \times B).$
Then, the equations between morphisms on the categorical side would be related to the cut-elimination procedure on the logical side, or at least -- in the more general case where termination is not guaranteed -- to some equivalence relation defined on derivations.
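Read naively in the category of sets (or Haskell types), the pairing operation between hom-sets mentioned above is just the following combinator; the trailing comment hints at where the equations between morphisms come in.

```haskell
-- The product rule, read in Set/Haskell: from maps C -> A and C -> B we
-- get a map C -> (A, B).  This is the pairing operation between hom-sets.
pairing :: (c -> a) -> (c -> b) -> (c -> (a, b))
pairing f g = \x -> (f x, g x)

-- The equational side mentioned above corresponds to the usual laws:
-- fst (pairing f g x) = f x  and  snd (pairing f g x) = g x.
```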
@Nathaniel Virgo. The horizontal line is more... capable let's say. There are 4 types of rules (see the text on the left-hand-side of the graphic): type formation, introduction, elimination and computation.
The computation rules correspond to equations.
The rest of the rules, as @Vincent Moreau noted, define constructions of morphisms (and of objects, in the case of type formation rules).
This makes some sense. It feels like if I tried to define a "free category with products" on some set of objects, then I'd end up specifying a bunch of functions along the lines of the one @Vincent Moreau posted, and then the horizontal lines would be notation for those.
Yes! And by the way, what you describe is a special case of the free category with products on some category. One way to achieve its dual, the coproduct completion, is the so-called "family construction".
It is described on the free coproduct completion nlab page.
Regarding the finite product completion of a set (= discrete category), it can be seen as a category of renamings, see for instance definition 4 of Semantic Analysis of Normalisation by Evaluation for Typed Lambda Calculus, where is any set and is the category of finite sets and functions between them.
Vincent Moreau said:
I would say that the deduction rules are related to some constructions on morphisms.
I agree. I think it's good to understand these 'constructions' as precisely as we can using category theory. Earlier in this thread @Mike Stay argued that these 'constructions' are typically natural transformations, or more generally [[dinatural transformations]].
Mike Stay said:
Often you can think of the horizontal line as a natural or dinatural transformation between profunctors. It is dinatural if a symbol appears both on the left and on the right of the turnstile, natural otherwise.
For example,
denotes a natural transformation
where
and
And for example,
denotes a dinatural transformation
where
and