You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
In the spirit of some recent threads, I have been attempting to write a proof for an exercise in "A Mathematical Introduction to Logic" by Enderton, and I was hoping for some feedback on the proof structure I would like to use. I'm also hoping this thread can be of interest to others who are learning how to prove things.
In this exercise, I find myself wanting to use a form of induction a bit more general than what I've used before, and I'm not sure if it's valid.
The exercise in question states:
"Let α be a wff; let c be the number of places at which binary connective symbols (∧, ∨, →, ↔) occur in α; let s be the number of places at which sentence symbols occur in α. (For example, if α is (A → (¬ A)) then c = 1 and s = 2.) Show by using the induction principle that s = c + 1."
Here a "wff" refers to a "well-formed formula", which is created in one of the following ways: a wff is either a sentence symbol, or is built from previously constructed wffs α and β by one of the formula-building operations, as (¬ α), (α ∧ β), (α ∨ β), (α → β), or (α ↔ β).
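The quantities in the exercise can be made concrete. Here is a minimal sketch (my own, not from the thread): wffs are encoded as nested Python tuples, where a sentence symbol is a string, (¬ α) is `('not', a)`, and a binary wff is `(op, a, b)`; the names `count_connectives` and `count_symbols` are hypothetical.

```python
def count_connectives(wff):
    """c: the number of places binary connective symbols occur in wff."""
    if isinstance(wff, str):            # a sentence symbol
        return 0
    if wff[0] == 'not':                 # negation is not a binary connective
        return count_connectives(wff[1])
    _, a, b = wff                       # one of the four binary connectives
    return 1 + count_connectives(a) + count_connectives(b)

def count_symbols(wff):
    """s: the number of places sentence symbols occur in wff."""
    if isinstance(wff, str):
        return 1
    if wff[0] == 'not':
        return count_symbols(wff[1])
    _, a, b = wff
    return count_symbols(a) + count_symbols(b)

# The example from the exercise: (A → (¬ A)) has c = 1 and s = 2.
alpha = ('implies', 'A', ('not', 'A'))
assert count_connectives(alpha) == 1 and count_symbols(alpha) == 2
```

The claim to be proved is then that `count_symbols(w) == count_connectives(w) + 1` for every wff `w` in this encoding.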
Now, the usual form of an induction argument I'm familiar with goes roughly like this: show that a statement P(0) holds for a base case, show that P(n) implies P(n+1), and conclude that P(n) holds for every natural number n.
However, the induction I'd like to do in this case has a bit of a different structure: instead of generating all the cases by repeatedly applying "add one", I want to start from the sentence symbols as base cases and show that each formula-building operation preserves the property s = c + 1.
The second step described here (a sort of inductive step) is straightforward to carry out. What I'm more concerned with here is the basic structure of the attempted proof involved. It feels like I'm trying to generalize the idea from normal induction where we make use of repeated applications of "add one" to generate all natural numbers of interest. Is there a more general form of induction where we talk about using a collection of operations which together generate all the objects of interest, from a set of "base case" objects?
That is, is the following argument structure valid?
- We have a statement P(x) which depends on x ∈ X. We want to show P(x) is true for all x ∈ X.
- Begin by showing P(b) is true for all b ∈ B, with B ⊆ X.
- Introduce a number of operations o ∈ O of potentially varying arity that "generate" X from B. That is, given any element of X, by applying our operations a finite number of times starting with elements in B, we can produce the specified element.
- Show that, for each o ∈ O, the output of o always satisfies P, provided that each of the input elements satisfies P.
- Conclude that P(x) is true for each x ∈ X.
Any thoughts around this topic are welcome!
The structure of a proof based on induction over the natural numbers should always be the same: first, you write a predicate over Nat, say P, then show that P(0) holds, and then that P(n) implies P(n+1) (or that P(m) for all m < n implies P(n), for strong induction). Is it possible you are trying to do induction over the syntax of wffs, where you first write a predicate over wffs, say Q, and then show that Q holds of all atomic propositions, and then that Q(x*y) holds assuming Q(x) and Q(y), for each connective *? It's also possible to do induction over proofs of wffs, and many things besides. In any case I'd encourage you to write the inductive hypothesis explicitly.
I would call this more general form of induction structural induction.
It feels like I'm trying to generalize the idea from normal induction where we make use of repeated applications of "add one" to generate all natural numbers of interest. Is there a more general form of induction where we talk about using a collection of operations which together generate all the objects of interest, from a set of "base case" objects?
Yes, the Wikipedia article on "structural induction" discusses this kind of induction. Near the end they get into the technical details of when and why structural induction works. The set of structures you're doing induction on needs to form a 'well-founded partial order'.
If you don't know what a well-founded partial order is and when you've got one, you can either 1) learn this stuff, or 2) fake it (this is probably rather common and you can probably get away with it by muttering words like "I will do structural induction on the set of wffs"), or 3) use old-fashioned induction on natural numbers, by assigning to each wff (or whatever you're doing) a natural number describing its complexity, in such a way that a wff of complexity n+1 can only be obtained in various specified ways (which you list) from a wff of complexity n.
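Option 3) can be sketched concretely. A hedged sketch (reusing a hypothetical tuple encoding of wffs: strings for sentence symbols, `('not', a)` and `(op, a, b)` otherwise): one natural complexity measure is the number of formula-building operations used to construct the wff.

```python
def complexity(wff):
    """Number of formula-building operations used to construct wff
    (wffs encoded as strings / ('not', a) / (op, a, b) tuples)."""
    if isinstance(wff, str):    # a bare sentence symbol: zero operations
        return 0
    if wff[0] == 'not':
        return 1 + complexity(wff[1])
    _, a, b = wff
    return 1 + complexity(a) + complexity(b)

# (A ∧ (¬ B)) is built with two operations: one ¬ and one ∧.
assert complexity(('and', 'A', ('not', 'B'))) == 2
```

Note that the two inputs of a binary operation can have different complexities, which is why the "various specified ways" of obtaining a more complex wff need to be listed carefully.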
Also see Mike Shulman's remark on induction. "Strong" induction on natural numbers is where you get to assume P(k) for all k < n in order to prove P(n).
(It's a good exercise to use ordinary "weak" induction on natural numbers to prove strong induction.)
Well-founded partial orders are one way to be precise about structural induction, but not the only way. For instance, your sketch:
- We have a statement P(x) which depends on x ∈ X. We want to show P(x) is true for all x ∈ X.
- Begin by showing P(b) is true for all b ∈ B, with B ⊆ X.
- Introduce a number of operations o ∈ O of potentially varying arity that "generate" X from B. That is, given any element of X, by applying our operations a finite number of times starting with elements in B, we can produce the specified element.
- Show that, for each o ∈ O, the output of o always satisfies P, provided that each of the input elements satisfies P.
- Conclude that P(x) is true for each x ∈ X.
can be justified directly without talking about well-foundedness explicitly. Define a subset S of X to be "O-closed" if whenever the inputs of some o ∈ O belong to S, so does the output. Your assumption that B generates X under O means that if an O-closed subset contains B, then it must be all of X. (If you like, this is by ordinary natural-number induction on the finite number of O-operations you have to apply to get from B to some element of X.) Your two proofs show that the set { x ∈ X : P(x) } contains B and is O-closed; therefore it is all of X.
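This closure argument can be sanity-checked by brute force. A sketch (the tuple encoding of wffs and the helper names here are my own, hypothetical): generate every wff reachable from a base set B in at most k applications of the operations, and confirm that each one lies in the set { x : s(x) = c(x) + 1 }.

```python
from itertools import product

OPS = ('and', 'or', 'implies', 'iff')

def generate(base, steps):
    """All wffs reachable from `base` by at most `steps` rounds of
    applying the formula-building operations."""
    wffs = set(base)
    for _ in range(steps):
        new = {('not', a) for a in wffs}
        new |= {(op, a, b) for op in OPS for a, b in product(wffs, repeat=2)}
        wffs |= new
    return wffs

def counts(wff):
    """Return the pair (s, c) for a wff in the tuple encoding."""
    if isinstance(wff, str):
        return (1, 0)
    if wff[0] == 'not':
        return counts(wff[1])
    _, a, b = wff
    sa, ca = counts(a)
    sb, cb = counts(b)
    return (sa + sb, ca + cb + 1)

# { x : P(x) } contains the base and is closed under every operation,
# so it should contain everything we can generate.
assert all(s == c + 1 for s, c in map(counts, generate({'A', 'B'}, 2)))
```

This is of course only a finite check, not a proof, but it exercises exactly the two facts the induction needs: the base set satisfies P, and each operation preserves P.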
Another way to justify structural induction is to talk about general "inductive definitions", which categorically are [[initial]] [[algebras for an endofunctor]]. In the case of logical formulas (I dislike the term "wff" because no one ever wants to talk about an "ill-formed" formula), we are talking about the endofunctor F of Set defined by
F(X) = S + X + X×X + X×X + X×X + X×X
where S is the set of sentence symbols. An F-algebra is a set A with a function F(A) → A, which means that we have an interpretation of every sentence symbol as an element of A, together with one unary operation corresponding to negation, and four binary operations corresponding to disjunction, conjunction, implication, and equivalence. The set W of formulas is the initial such F-algebra, which means that its elements are "freely generated" by applying these operations formally. This is a categorical way to express that "Every wff is constructed from sentence symbols by application of a finite number of formula-building operations."
Given this definition, we can justify structural induction as follows. Suppose P is some property of formulas, and consider the subset W' = { φ ∈ W : P(φ) }. The "base case" and "inductive steps" show that W' is itself an F-algebra such that the inclusion W' → W is an F-algebra homomorphism. But W is the initial F-algebra, so there is also an F-algebra homomorphism W → W', and uniqueness means that the composite W → W' → W is the identity. Therefore, W' = W, i.e. P holds for all formulas.
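Initiality also says something computational: any F-algebra structure on a set induces a unique map out of the set of formulas, i.e. a fold. A sketch under the same hypothetical tuple encoding of wffs, where the pair (s, c) from the exercise is computed as an algebra:

```python
def fold(wff, symbol, neg, binop):
    """The unique algebra map out of the set of formulas: replace each
    constructor of the wff (string / ('not', a) / (op, a, b)) by the
    corresponding given operation."""
    if isinstance(wff, str):
        return symbol(wff)
    if wff[0] == 'not':
        return neg(fold(wff[1], symbol, neg, binop))
    op, a, b = wff
    return binop(op, fold(a, symbol, neg, binop), fold(b, symbol, neg, binop))

# Pairs (s, c) form an F-algebra: a symbol contributes (1, 0), negation
# changes nothing, and a binary connective adds the parts plus one connective.
def s_and_c(wff):
    return fold(wff,
                symbol=lambda name: (1, 0),
                neg=lambda p: p,
                binop=lambda op, p, q: (p[0] + q[0], p[1] + q[1] + 1))

assert s_and_c(('implies', 'A', ('not', 'A'))) == (2, 1)
```

The base case and inductive steps of the exercise correspond exactly to checking that the subset of pairs with s = c + 1 is closed under these three operations.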
Jacob Zelko is just beginning his studies of pure math, learning how to prove things, so initial algebras of endofunctors may be too much for him - and definitely for the faculty who will be grading his homework!
I don't see Jacob Zelko in this thread?
Awesome! Looks like there's a lot of great responses here! Thanks everyone. Once I spend some time absorbing what's here in a bit more detail, I'll write up some thoughts in response.
Okay, @Mike Shulman. I'm mixed up about people, as usual. Jakob has been looking for advice on how to prove things, and I thought this was a continuation of that. I was recently explaining the basics of first-order logic to David Egolf, like what's a "predicate", so initial algebras of endofunctors might be a bit heavy-duty for him, but what the heck.... he's resilient, he'll bounce back.
Sure, I can see that it might be heavy for some readers. But I can also imagine a reader who knows some category theory but not so much set theory or formal logic, for whom initial algebras of endofunctors might be easier to understand than well-founded partial orders. I don't know anything about David's background.
I've been talking with him a lot here so I have a detailed model of his internal state. :upside_down: But yes, even if he isn't thrilled by "induction via initial algebras" (and he actually might be), someone else will.
I just noticed that Clive Newstead's intro-to-proofs textbook Infinite Descent includes a section at the very end (12.2) on "inductively defined sets". It looks pretty good and pretty accessible.
Mike Shulman said:
I just noticed that Clive Newstead's intro-to-proofs textbook Infinite Descent includes a section at the very end (12.2) on "inductively defined sets". It looks pretty good and pretty accessible.
Oh, that does look good! In addition to the section you note, I am also drawn to the simple clear definition of a "predicate". (Ahh, the struggle of having access to more interesting books than the time and energy to read them!)
John Baez said:
3) use old-fashioned induction on natural numbers, by assigning to each wff (or whatever you're doing) a natural number describing its complexity, in such a way that a wff of complexity n+1 can only be obtained in various specified ways (which you list) from a wff of complexity n.
I thought it would be interesting to try to complete the above exercise using a few of the different approaches suggested above, starting with the "weak induction principle". The trick here is assigning a complexity natural number to wffs so that a wff of complexity n+1 can be made in a simple way from wffs with complexity n.
For example, consider the wff (A ∧ (¬ B)). This is formed by applying the formula-building operation for ∧ to A and (¬ B). Since both of these wffs are used as ingredients to make a wff with some complexity n+1, I'd like to assign them each complexity n. However, (¬ B) seems like it should probably be assigned a higher complexity number than A.
In general, it seems more natural to instead ask that we build a wff of complexity n+1 from formulas of complexity n or less (instead of strictly those of complexity n). I believe that will correspond to using strong induction, and I'll try that in a minute.
Here's the principle of strong induction as stated in "An Infinite Descent into Pure Mathematics":
strong induction
We set P(n) to be: "A wff constructed by application of n formula-building operations satisfies s = c + 1."
P(0) is: A wff constructed by application of 0 formula-building operations satisfies s = c + 1. This is true because s = 1 and c = 0 for our sentence symbols, which are the only wffs formed by applying zero formula-building operations.
For arbitrary n, we now assume that P(k) is true for all k ≤ n.
We want to show that P(n+1) is true under this assumption.
P(n+1) is: A wff constructed by application of n+1 formula-building operations satisfies s = c + 1.
Let φ be a wff constructed by application of n+1 formula-building operations.
Then φ is of one of the following forms:
(¬ α), (α ∧ β), (α ∨ β), (α → β), (α ↔ β)
where α and β are wffs.
In each case, φ is constructed from wffs constructed by application of n or fewer formula-building operations.
In the first case, φ is (¬ α). We know s_α = c_α + 1 by our assumption.
(Here s_α denotes s for the wff α, and c_α denotes c for α.)
So, s_φ = s_α and c_φ = c_α, and hence s_φ = s_α = c_α + 1 = c_φ + 1.
In the second case, φ is (α ∧ β). We know s_α = c_α + 1 and s_β = c_β + 1 by our assumption.
So, s_φ = s_α + s_β and c_φ = c_α + c_β + 1. That gives s_φ = (c_α + 1) + (c_β + 1) = (c_α + c_β + 1) + 1 = c_φ + 1, as desired.
The remaining three cases are similar to this one.
We conclude P(n+1) is true for arbitrary n, assuming that P(k) is true for all k ≤ n.
By the strong induction principle, P(n) is true for all n.
This approach seems like it's a good match for this exercise, as we can combine wffs of varying complexity to make a wff of a given complexity.
Whew. Time for me to take a break now!
It is perhaps worth noting that this proof by strong induction of a statement about wffs can be turned into a proof of the statement about graphs that Jacob Zelko started the Fantastic Proofs thread with. Just substitute "path" for "wff", "vertex" for "sentence symbol", and "edge" for "connective symbol" (and ignore negation altogether).
Well, technically we also have to start at n=1 instead of n=0 to match up with the definition of path given there (though I would argue for including the degenerate path with one vertex in the set of paths).
Hi folks! Been busy traveling and wrapping up a paper submission (:partying_face:)! Was catching up and saw my name pop up a bit! Feel free to always ping me if you want. :smile:
Regarding induction, I’m still in the process of gaining basic understanding to talk about wffs. Right now, working through Velleman and truth tables as we speak!!
Mike Shulman said:
Another way to justify structural induction is to talk about general "inductive definitions", which categorically are [[initial]] [[algebras for an endofunctor]]. In the case of logical formulas (I dislike the term "wff" because no one ever wants to talk about an "ill-formed" formula), we are talking about the endofunctor F of Set defined by
F(X) = S + X + X×X + X×X + X×X + X×X
where S is the set of sentence symbols. An F-algebra is a set A with a function F(A) → A, which means that we have an interpretation of every sentence symbol as an element of A, together with one unary operation corresponding to negation, and four binary operations corresponding to disjunction, conjunction, implication, and equivalence. The set W of formulas is the initial such F-algebra, which means that its elements are "freely generated" by applying these operations formally. This is a categorical way to express that "Every wff is constructed from sentence symbols by application of a finite number of formula-building operations."
Let me try and understand this a little. We have our endofunctor F which sends each set X to the set S + X + X×X + X×X + X×X + X×X. Then, for a choice of a set A, a function F(A) → A defines several functions "side by side": a function A → A for ¬, a function A×A → A for each of ∧, ∨, →, ↔, and an extra function (relative to the operations I discussed above) S → A.
I think we are working towards providing data that describes how to make new logical formulas from old ones, or from sentence symbols. It's cool to see how an F-algebra describes a set of "building blocks", together with a collection of operations to combine them. I suppose this idea could be generalized to other categories too, with products and coproducts (?).
Now, there are lots of choices we could make for A. However, we want to pick A in some "free" way, if I understand correctly, so that A becomes a set of logical formulas. I think this corresponds to putting the symbols of S in A, and then adding a new formula for the output of each of our unary and binary operators - and we require the outputs of each of these to be distinct, for distinct inputs. I guess this should yield the logical formulas we want.
To understand the rest, I think I'd need to try some examples with the definition of a morphism between F-algebras. I don't yet really understand what it means to have a morphism between two F-algebras. (Although I see the definition on the nlab page you linked, it will take me some absorbing/practice with the idea).
This is a great tutorial, but it also covers coinduction at the same time. Maybe you can try reading only Section 5.
In this case, a morphism of F-algebras is a function that preserves the map from S, the negation unary operation, and the four binary operations. That is, if f : A → B is a morphism of F-algebras, we have f(s_A) = s_B for each sentence symbol s, f(¬_A a) = ¬_B f(a), f(a ∧_A a') = f(a) ∧_B f(a'), and so on.
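A familiar concrete example of such a morphism is truth-evaluation: the booleans carry an F-algebra structure (an interpretation of each symbol via a truth assignment, plus the usual logical operations), and evaluating a formula is the induced algebra map, which preserves each operation by construction. A sketch, again using a hypothetical tuple encoding of wffs (strings / `('not', a)` / `(op, a, b)`):

```python
# The boolean F-algebra: one function per binary operation of the signature.
BOOL_OPS = {'and':     lambda x, y: x and y,
            'or':      lambda x, y: x or y,
            'implies': lambda x, y: (not x) or y,
            'iff':     lambda x, y: x == y}

def evaluate(wff, assignment):
    """Evaluate a wff under a truth assignment mapping sentence
    symbols (strings) to booleans."""
    if isinstance(wff, str):
        return assignment[wff]
    if wff[0] == 'not':
        return not evaluate(wff[1], assignment)
    op, a, b = wff
    return BOOL_OPS[op](evaluate(a, assignment), evaluate(b, assignment))

# Preservation of operations, e.g. f(a ∧ b) = f(a) ∧ f(b):
env = {'A': True, 'B': False}
a, b = ('or', 'A', 'B'), ('not', 'B')
assert evaluate(('and', a, b), env) == (evaluate(a, env) and evaluate(b, env))
```

Checking the preservation equations on small examples like this is a good way to get a feel for what an F-algebra morphism requires.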