Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: deprecated: mathematics

Topic: proving things: induction


view this post on Zulip David Egolf (Apr 03 2023 at 15:34):

In the spirit of some recent threads, I have been attempting to write a proof for an exercise in "A Mathematical Introduction to Logic" by Enderton, and I was hoping for some feedback on the proof structure I would like to use. I'm also hoping this thread can be of interest to other people hoping to learn how to prove things.

In this exercise, I find myself wanting to use a form of induction a bit more general than what I've used before, and I'm not sure if it's valid.

The exercise in question states:
"Let α be a wff; let c be the number of places at which binary connective symbols (∧, ∨, →, ↔) occur in α; let s be the number of places at which sentence symbols occur in α. (For example, if α is (A → (¬ A)) then c = 1 and s = 2.) Show by using the induction principle that s = c + 1."

Here a "wff" refers to a "well-formed formula", which is created in one of the following ways:

Now, the usual form of an induction argument I'm familiar with goes roughly like this:

- Show that the statement holds for $0$ (the base case).
- Show that if the statement holds for $n$, then it also holds for $n+1$ (the inductive step).
- Conclude that the statement holds for every natural number.

However, the induction I'd like to do in this case has a bit of a different structure:

- Show that the property holds for every sentence symbol (the "base case" wffs).
- Show that each formula-building operation preserves the property: if the input wffs satisfy it, so does the output wff.
- Conclude that the property holds for every wff.

The second step described here (a sort of inductive step) is straightforward to carry out. What I'm more concerned with here is the basic structure of the attempted proof involved. It feels like I'm trying to generalize the idea from normal induction where we make use of repeated applications of "add one" to generate all natural numbers of interest. Is there a more general form of induction where we talk about using a collection of operations which together generate all the objects of interest, from a set of "base case" objects?

That is, is the following argument structure valid?

- Let $X$ be a set of objects, let $Y \subseteq X$ be a set of "base case" objects, and let $b$ be a collection of operations $b_i$, such that every element of $X$ can be obtained from elements of $Y$ by finitely many applications of the $b_i$.
- Show that $P(y)$ holds for every $y \in Y$.
- Show that each $b_i$ preserves $P$: if $P$ holds for its inputs, then $P$ holds for its output.
- Conclude that $P(x)$ holds for every $x \in X$.

Any thoughts around this topic are welcome!

view this post on Zulip Ryan Wisnesky (Apr 03 2023 at 15:47):

The structure of a proof based on induction over the natural numbers should always be the same: first, you write a predicate over Nat, say P, and then show that P(0) holds, and then that P(n) implies P(n+1) (or that P(m) for all m <= n implies P(n), for strong induction). Is it possible you are trying to do induction over the syntax of wffs, where you first write a predicate over wffs, say Q, and then show that Q holds of all atomic propositions, and then that Q(x*y) holds assuming Q(x) and Q(y), for each connective *? It's also possible to do induction over proofs of wffs, and many things besides. In any case I'd encourage you to write the inductive hypothesis explicitly.
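
For readers who think computationally, here is a minimal Haskell sketch of the structure Ryan describes (the names Wff, countS, countC and so on are hypothetical, not from the thread): the same case split that drives structural induction also drives structural recursion, for instance when counting sentence symbols and binary connectives.

```haskell
-- Hypothetical sketch: Enderton-style wffs as an algebraic data type.
data Wff
  = Atom String        -- sentence symbol
  | Not Wff            -- (¬ α)
  | And Wff Wff        -- (α ∧ β)
  | Or  Wff Wff        -- (α ∨ β)
  | Imp Wff Wff        -- (α → β)
  | Iff Wff Wff        -- (α ↔ β)

-- s and c from the exercise, defined by structural recursion; the claim
-- s = c + 1 is then proved by structural induction over the same cases.
countS, countC :: Wff -> Int
countS (Atom _)  = 1
countS (Not a)   = countS a
countS (And a b) = countS a + countS b
countS (Or  a b) = countS a + countS b
countS (Imp a b) = countS a + countS b
countS (Iff a b) = countS a + countS b

countC (Atom _)  = 0
countC (Not a)   = countC a
countC (And a b) = countC a + countC b + 1
countC (Or  a b) = countC a + countC b + 1
countC (Imp a b) = countC a + countC b + 1
countC (Iff a b) = countC a + countC b + 1

-- The exercise's example (A → (¬ A)): two sentence symbols, one binary connective.
example :: Wff
example = Imp (Atom "A") (Not (Atom "A"))

main :: IO ()
main = print (countS example, countC example)  -- prints (2,1)
```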

view this post on Zulip Ralph Sarkis (Apr 03 2023 at 15:49):

I would call this more general form of induction structural induction.

view this post on Zulip John Baez (Apr 03 2023 at 15:55):

David Egolf said:

It feels like I'm trying to generalize the idea from normal induction where we make use of repeated applications of "add one" to generate all natural numbers of interest. Is there a more general form of induction where we talk about using a collection of operations which together generate all the objects of interest, from a set of "base case" objects?

Yes, the Wikipedia article on "structural induction" discusses this kind of induction. Near the end they get into the technical details of when and why structural induction works. The set of structures you're doing induction on needs to form a 'well-founded partial order'.

view this post on Zulip John Baez (Apr 03 2023 at 15:58):

If you don't know what a well-founded partial order is and when you've got one, you can either 1) learn this stuff, or 2) fake it (this is probably rather common and you can probably get away with it by muttering words like "I will do structural induction on the set of wffs"), or 3) use old-fashioned induction on natural numbers, by assigning to each wff (or whatever you're doing) a natural number describing its complexity, in such a way that a wff of complexity n+1 can only be obtained in various specified ways (which you list) from a wff of complexity n.
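
For concreteness, one possible complexity measure along the lines of option 3 (a sketch; the thread does not spell one out) counts the formula-building operations used:

$$\mathrm{cx}(A) = 0 \ \text{ for a sentence symbol } A, \qquad \mathrm{cx}((\neg \alpha)) = \mathrm{cx}(\alpha) + 1, \qquad \mathrm{cx}((\alpha \ast \beta)) = \mathrm{cx}(\alpha) + \mathrm{cx}(\beta) + 1$$

for each binary connective $\ast$; for instance $\mathrm{cx}((A \to (\neg A))) = 2$. With this measure a wff of complexity $n+1$ is built from wffs of complexity at most $n$, a point David takes up later in the thread.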

view this post on Zulip John Baez (Apr 03 2023 at 16:02):

Also see Mike Shulman's remark on induction. "Strong" induction on natural numbers is where you get to assume $P(k)$ for all $k \le n$ in order to prove $P(n+1)$.

view this post on Zulip John Baez (Apr 03 2023 at 16:03):

(It's a good exercise to use ordinary "weak" induction on natural numbers to prove strong induction.)
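
One standard route for that exercise (a sketch): apply ordinary weak induction to the strengthened predicate

$$Q(n) \;:=\; P(k) \ \text{ holds for all } k \le n.$$

Then $Q(0)$ is just $P(0)$; and if $Q(n)$ holds, the strong-induction hypothesis gives $P(n+1)$, hence $Q(n+1)$. Weak induction yields $Q(n)$ for every $n$, and in particular $P(n)$ for every $n$.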

view this post on Zulip Mike Shulman (Apr 03 2023 at 20:32):

Well-founded partial orders are one way to be precise about structural induction, but not the only way. For instance, your sketch above can be justified directly without talking about well-foundedness explicitly. Define a subset $Z$ of $X$ to be "$b$-closed" if whenever the inputs of some $b_i$ belong to $Z$, so does the output. Your assumption that $Y$ generates $X$ under $b$ means that if a $b$-closed subset contains $Y$, then it must be all of $X$. (If you like, this is by ordinary natural-number induction on the finite number of $b$-operations you have to apply to get from $Y$ to some element $x \in X$.) Your two proofs show that the set $\{x \in X \mid P(x)\}$ contains $Y$ and is $b$-closed; therefore it is all of $X$.

view this post on Zulip Mike Shulman (Apr 03 2023 at 20:33):

Another way to justify structural induction is to talk about general "inductive definitions", which categorically are [[initial]] [[algebras for an endofunctor]]. In the case of logical formulas (I dislike the term "wff" because no one ever wants to talk about an "ill-formed" formula), we are talking about the endofunctor of $\mathrm{Set}$ defined by
$$F(X) = S + X + (X\times X) + (X\times X) + (X\times X) + (X\times X)$$
where $S$ is the set of sentence symbols. An $F$-algebra is a set $X$ with a function $F(X) \to X$, which means that we have an interpretation of every sentence symbol as an element of $X$, together with one unary operation $X \to X$ corresponding to negation, and four binary operations $X \times X \to X$ corresponding to disjunction, conjunction, implication, and equivalence. The set $W$ of formulas is the initial such $F$-algebra, which means that its elements are "freely generated" by applying these operations formally. This is a categorical way to express that "Every wff is constructed from sentence symbols by application of a finite number of formula-building operations."

Given this definition, we can justify structural induction as follows. Suppose $P(x)$ is some property of formulas, and consider $U = \{ x \in W \mid P(x) \}$. The "base case" and "inductive steps" show that $U$ is itself an $F$-algebra such that the inclusion $U \hookrightarrow W$ is an $F$-algebra homomorphism. But $W$ is the initial $F$-algebra, so there is also an $F$-algebra homomorphism $W \to U$, and uniqueness means that the composite $W \to U \hookrightarrow W$ is the identity. Therefore, $U = W$, i.e. $P(x)$ holds for all $x \in W$.
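
As a rough Haskell rendering of this picture (a sketch; the names F, W, fold are hypothetical and not from the thread), an $F$-algebra on a type `a` is a map `F a -> a`, the formulas form the initial one, and initiality hands every other $F$-algebra a unique structure-preserving map out of the formulas, which is exactly structural recursion:

```haskell
-- Hypothetical sketch of the signature functor F(X) = S + X + 4·(X × X),
-- where S is the set of sentence symbols.
type Sym = String

data F x
  = AtomF Sym                                  -- from S
  | NotF x                                     -- negation
  | AndF x x | OrF x x | ImpF x x | IffF x x   -- the four binary connectives

-- The set W of formulas: the initial F-algebra, whose structure map is In.
newtype W = In (F W)

-- Initiality, concretely: every F-algebra (alg :: F a -> a) receives a unique
-- F-algebra morphism from W, obtained by folding alg through the syntax tree.
fold :: (F a -> a) -> W -> a
fold alg (In layer) = alg (mapF (fold alg) layer)
  where
    mapF _ (AtomF s)  = AtomF s
    mapF f (NotF x)   = NotF (f x)
    mapF f (AndF x y) = AndF (f x) (f y)
    mapF f (OrF x y)  = OrF (f x) (f y)
    mapF f (ImpF x y) = ImpF (f x) (f y)
    mapF f (IffF x y) = IffF (f x) (f y)

-- Example usage: the total number of symbol and connective occurrences,
-- computed by supplying an F-algebra on Int.
size :: W -> Int
size = fold alg
  where
    alg (AtomF _)  = 1
    alg (NotF n)   = n + 1
    alg (AndF m n) = m + n + 1
    alg (OrF m n)  = m + n + 1
    alg (ImpF m n) = m + n + 1
    alg (IffF m n) = m + n + 1
```

Structural induction is the proof-level version of fold: to prove a property of every element of W, show that the subset of formulas satisfying it is itself an F-algebra with the inclusion a homomorphism, which is what the argument with $U \subseteq W$ above does.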

view this post on Zulip John Baez (Apr 03 2023 at 21:17):

Jacob Zelko is just beginning his studies of pure math, learning how to prove things, so initial algebras of endofunctors may be too much for him - and definitely for the faculty who will be grading his homework!

view this post on Zulip Mike Shulman (Apr 03 2023 at 21:18):

I don't see Jacob Zelko in this thread?

view this post on Zulip David Egolf (Apr 03 2023 at 23:05):

Awesome! Looks like there's a lot of great responses here! Thanks everyone. Once I spend some time absorbing what's here in a bit more detail, I'll write up some thoughts in response.

view this post on Zulip John Baez (Apr 03 2023 at 23:36):

Okay, @Mike Shulman. I'm mixed up about people, as usual. Jacob has been looking for advice on how to prove things, and I thought this was a continuation of that. I was recently explaining the basics of first-order logic to David Egolf, like what's a "predicate", so initial algebras of endofunctors might be a bit heavy-duty for him, but what the heck.... he's resilient, he'll bounce back.

view this post on Zulip Mike Shulman (Apr 03 2023 at 23:44):

Sure, I can see that it might be heavy for some readers. But I can also imagine a reader who knows some category theory but not so much set theory or formal logic, for whom initial algebras of endofunctors might be easier to understand than well-founded partial orders. I don't know anything about David's background.

view this post on Zulip John Baez (Apr 03 2023 at 23:47):

I've been talking with him a lot here so I have a detailed model of his internal state. :upside_down: But yes, even if he isn't thrilled by "induction via initial algebras" (and he actually might be), someone else will.

view this post on Zulip Mike Shulman (Apr 04 2023 at 17:33):

I just noticed that Clive Newstead's intro-to-proofs textbook Infinite Descent includes a section at the very end (12.2) on "inductively defined sets". It looks pretty good and pretty accessible.

view this post on Zulip David Egolf (Apr 04 2023 at 18:04):

Mike Shulman said:

I just noticed that Clive Newstead's intro-to-proofs textbook Infinite Descent includes a section at the very end (12.2) on "inductively defined sets". It looks pretty good and pretty accessible.

Oh, that does look good! In addition to the section you note, I am also drawn to the simple clear definition of a "predicate". (Ahh, the struggle of having access to more interesting books than the time and energy to read them!)

view this post on Zulip David Egolf (Apr 04 2023 at 20:49):

John Baez said:

3) use old-fashioned induction on natural numbers, by assigning to each wff (or whatever you're doing) a natural number describing its complexity, in such a way that a wff of complexity n+1 can only be obtained in various specified ways (which you list) from a wff of complexity n.

I thought it would be interesting to try to complete the above exercise using a few of the different approaches suggested above, starting with the "weak induction principle". The trick here is assigning a complexity natural number to wffs so that a wff of complexity $n+1$ can be made in a simple way from wffs with complexity $n$.

For example, consider the wff $(A_1 \vee A_2) \vee A_3$. This is formed by applying the $\vee$ formula-building operation to $(A_1 \vee A_2)$ and $A_3$. Since both of these wffs are used as ingredients to make a wff with some complexity $n+1$, I'd like to assign them each complexity $n$. However, $A_1 \vee A_2$ seems like it should probably be assigned a higher complexity number than $A_3$.

In general, it seems more natural to instead ask that we build a wff of complexity $n+1$ from formulas of complexity $n$ or less (instead of strictly those of complexity $n$). I believe that corresponds to using strong induction, and I'll try that in a minute.

view this post on Zulip David Egolf (Apr 04 2023 at 21:18):

Here's the principle of strong induction as stated in "An Infinite Descent into Pure Mathematics":
[image: the book's statement of the strong induction principle: if $p(0)$ is true and, for each $n \geq 0$, $p(k)$ being true for all $0 \leq k \leq n$ implies $p(n+1)$ is true, then $p(n)$ is true for all $n \geq 0$.]

We set $p(n)$ to be: "A wff constructed by application of $n$ formula-building operations satisfies $s = c + 1$."
$p(0)$ is: A wff constructed by application of 0 formula-building operations satisfies $s = c + 1$. This is true because $s = 1$ and $c = 0$ for our sentence symbols, which are the only wffs formed by applying zero formula-building operations.

For arbitrary $n \geq 0$, we now assume that $p(k)$ is true for all $0 \leq k \leq n$.
We want to show that $p(n+1)$ is true under this assumption.
$p(n+1)$ is: A wff constructed by application of $n+1$ formula-building operations satisfies $s = c + 1$.
Let $\gamma$ be a wff constructed by application of $n+1$ formula-building operations.
Then $\gamma$ is of one of the following forms:
$(\neg \alpha)$, $(\alpha \wedge \beta)$, $(\alpha \vee \beta)$, $(\alpha \to \beta)$, $(\alpha \leftrightarrow \beta)$
where $\alpha$ and $\beta$ are wffs.
In each case, $\gamma$ is built from wffs constructed by application of $n$ or fewer formula-building operations.
In the first case, $\gamma = (\neg \alpha)$. We know $\alpha_s = \alpha_c + 1$ by our assumption.
(Here $\alpha_s$ denotes $s$ for the wff $\alpha$, and $\alpha_c$ denotes $c$ for $\alpha$.)
So, $\gamma_s = \alpha_s$ and $\gamma_c = \alpha_c$, and hence $\gamma_s = \gamma_c + 1$.
In the second case, $\gamma = (\alpha \wedge \beta)$, and we know $\alpha_s = \alpha_c + 1$ and $\beta_s = \beta_c + 1$ by our assumption.
So, $\gamma_s = \alpha_s + \beta_s = \alpha_c + \beta_c + 2$ and $\gamma_c = \alpha_c + \beta_c + 1$. That gives $\gamma_s = \gamma_c + 1$ as desired.
The remaining three cases are similar to this one.
We conclude $p(n+1)$ is true for arbitrary $n \geq 0$, assuming that $p(k)$ is true for all $0 \leq k \leq n$.

By the strong induction principle, $p(n)$ is true for all $n \geq 0$.

This approach seems like it's a good match for this exercise, as we can combine wffs of varying complexity to make a wff of a given complexity.

view this post on Zulip David Egolf (Apr 04 2023 at 21:20):

Whew. Time for me to take a break now!

view this post on Zulip Jason Erbele (Apr 05 2023 at 20:05):

It is perhaps worth noting that this proof by strong induction of a statement about wffs can be turned into a proof of the statement about graphs that Jacob Zelko started the Fantastic Proofs thread with. Just substitute "path" for "wff", "vertex" for "sentence symbol", and "edge" for "connective symbol" (and ignore negation altogether).

Well, technically we also have to start at n=1 instead of n=0 to match up with the definition of path given there (though I would argue for including the degenerate path with one vertex in the set of paths).

view this post on Zulip Jacob Zelko (Apr 08 2023 at 00:14):

Hi folks! Been busy traveling and wrapping up a paper submission (:partying_face:)! Was catching up and saw my name pop up a bit! Feel free to always ping me if you want. :smile:

Regarding induction, I’m still in the process of gaining basic understanding to talk about wffs. Right now, working through Velleman and truth tables as we speak!!

view this post on Zulip David Egolf (Apr 20 2023 at 15:16):

Mike Shulman said:

Another way to justify structural induction is to talk about general "inductive definitions", which categorically are [[initial]] [[algebras for an endofunctor]]. In the case of logical formulas (I dislike the term "wff" because no one ever wants to talk about an "ill-formed" formula), we are talking about the endofunctor of $\mathrm{Set}$ defined by
$$F(X) = S + X + (X\times X) + (X\times X) + (X\times X) + (X\times X)$$
where $S$ is the set of sentence symbols. An $F$-algebra is a set $X$ with a function $F(X) \to X$, which means that we have an interpretation of every sentence symbol as an element of $X$, together with one unary operation $X \to X$ corresponding to negation, and four binary operations $X \times X \to X$ corresponding to disjunction, conjunction, implication, and equivalence. The set $W$ of formulas is the initial such $F$-algebra, which means that its elements are "freely generated" by applying these operations formally. This is a categorical way to express that "Every wff is constructed from sentence symbols by application of a finite number of formula-building operations."

Let me try and understand this a little. We have our endofunctor $F: \mathrm{Set} \to \mathrm{Set}$ which sends each set $X$ to the set $S + X + (X\times X) + (X\times X) + (X\times X) + (X\times X)$. Then, for a choice of $X$, a function $F(X) \to X$ defines several functions "side by side": $F_S: S \to X$, $F_{\neg}: X \to X$, $F_{\wedge}: X \times X \to X$, $F_{\vee}: X \times X \to X$, $F_{\implies}: X \times X \to X$, and an extra function (relative to the operations I discussed above) $F_{\equiv}: X \times X \to X$.

I think we are working towards providing data that describes how to make new logical formulas from old ones, or from sentence symbols. It's cool to see how an $F$-algebra describes a set of "building blocks", together with a collection of operations to combine them. I suppose this idea could be generalized to other categories too, with products and coproducts (?).

Now, there are lots of choices we could make for $X$. However, we want to pick $X$ in some "free" way, if I understand correctly, so that $X$ becomes a set of logical formulas. I think this corresponds to putting the symbols $S$ in $X$, and then adding a new formula for the output of each of our unary and binary operators - and we require the outputs of each of these to be distinct, for distinct inputs. I guess this should yield the logical formulas we want.

view this post on Zulip David Egolf (Apr 20 2023 at 15:28):

To understand the rest, I think I'd need to try some examples with the definition of a morphism between $F$-algebras. I don't yet really understand what it means to have a morphism between two $F$-algebras. (Although I see the definition on the nlab page you linked, it will take me some absorbing/practice with the idea).

view this post on Zulip Ralph Sarkis (Apr 20 2023 at 15:33):

This is a great tutorial, but it also covers coinduction at the same time. Maybe you can try reading only Section 5.

view this post on Zulip Mike Shulman (Apr 20 2023 at 16:29):

In this case, a morphism of $F$-algebras is a function that preserves the map from $S$, the negation unary operation, and the four binary operations. That is, if $\phi: X \to Y$ is a morphism of $F$-algebras, we have $\phi(F_S^X(s)) = F_S^Y(s)$, $\phi(F_\neg^X(x)) = F_\neg^Y(\phi(x))$, $\phi(F_\wedge^X(x_1,x_2)) = F_\wedge^Y(\phi(x_1),\phi(x_2))$, etc.
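
To make these equations concrete, here is a hedged Haskell sketch (the names Wff and phi are hypothetical; this is an illustration rather than anything stated in the thread). Take $X$ to be the formulas with their syntactic operations, and take $Y = \mathbb{Z}$ with $F^Y_S(s) = 1$, $F^Y_\neg(n) = n$, and each binary operation given by $(m, n) \mapsto m + n - 1$. Then the function $\phi$ computing $s - c$ satisfies exactly the displayed equations:

```haskell
-- Hypothetical sketch: a concrete F-algebra morphism from formulas to Int.
data Wff = Atom String | Not Wff | And Wff Wff | Or Wff Wff | Imp Wff Wff | Iff Wff Wff

-- phi computes s - c. It is an F-algebra morphism because, for example,
--   phi (And a b) = phi a + phi b - 1 = F^Y_and (phi a) (phi b),
-- which is an instance of phi(F^X_and(x1,x2)) = F^Y_and(phi x1, phi x2) above.
phi :: Wff -> Int
phi (Atom _)  = 1
phi (Not a)   = phi a
phi (And a b) = phi a + phi b - 1
phi (Or  a b) = phi a + phi b - 1
phi (Imp a b) = phi a + phi b - 1
phi (Iff a b) = phi a + phi b - 1
```

The constant function $1$ is also an $F$-algebra morphism into this $Y$ (since $1 + 1 - 1 = 1$), so initiality of the formula algebra forces $\phi$ to coincide with it; that is another way of seeing the exercise's $s = c + 1$.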