Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: deprecated: mathematics

Topic: Nakayama's lemma


view this post on Zulip John Baez (Aug 30 2023 at 08:31):

I never took a course on commutative algebra because in grad school I was mainly interested in mathematical physics, so I'm just finally now learning Nakayama's lemma - because I need it for something.

view this post on Zulip John Baez (Aug 30 2023 at 08:34):

There are many alternative formulations, some perhaps easier to remember than others. On MathOverflow someone gave this nice mnemonic for it, which happens to match what I actually need:

$IM = M \implies im = m$

What does this mean? It means:

Nakayama's Lemma. Let $I$ be an ideal in a commutative ring $R$ and let $M$ be a finitely generated $R$-module. If $IM = M$ then there exists $i \in I$ such that $im = m$ for all $m \in M$.
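A quick sanity check of the statement in Python, on toy data chosen just for illustration: $R = \mathbb{Z}$, $I = (2)$, $M = \mathbb{Z}/3$ (here $IM = M$ because $2$ is invertible mod $3$, and $i = 4 \in I$ acts as the identity):

```python
# Toy instance of Nakayama's lemma (illustrative data, not from the thread):
# R = Z, I = (2), M = Z/3. IM = M since 2 is invertible mod 3, and the
# element i = 4 of I satisfies i*m = m for every m in M.
M = range(3)            # elements of Z/3
I_gen = 2               # I = 2Z
i = 4                   # candidate element of I (4 = 2*2)

assert i % I_gen == 0                               # i really lies in I
assert set((I_gen * m) % 3 for m in M) == set(M)    # IM = M: 2*M is all of Z/3
assert all((i * m) % 3 == m for m in M)             # i acts as the identity on M
print("Nakayama check passed: i = 4 acts as identity on Z/3")
```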

view this post on Zulip John Baez (Aug 30 2023 at 08:37):

This instantly implies something I wanted to know: if $I$ is an ideal in a commutative ring $R$ with $I^2 = I$, then $I = pR$ for some $p \in R$ with $p^2 = p$.
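A tiny worked instance (data assumed purely for illustration): in $R = \mathbb{Z}/6$ the ideal $I = (3)$ satisfies $I^2 = I$, and indeed $I = pR$ for the idempotent $p = 3$:

```python
# Illustration (added example): in R = Z/6 the ideal I = (3) satisfies
# I^2 = I, and I = pR for the idempotent p = 3 (since 3*3 = 9 = 3 in Z/6).
R = range(6)
I = {(3 * r) % 6 for r in R}                      # I = 3R = {0, 3}
I_squared = {(a * b) % 6 for a in I for b in I}   # products already generate I^2 here
assert I_squared == I                             # I^2 = I
p = 3
assert (p * p) % 6 == p                           # p is idempotent
assert {(p * r) % 6 for r in R} == I              # I = pR
print("I = (3) in Z/6 is generated by the idempotent 3")
```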

view this post on Zulip John Baez (Aug 30 2023 at 08:43):

Starting from this simple formulation I'm finally getting more interested in the many alternative formulations and consequences, some of which sound more conceptual, like

If $f: M \to M$ is a surjective endomorphism of a finitely generated $R$-module, then $f$ is an isomorphism.

view this post on Zulip John Baez (Aug 30 2023 at 08:44):

or

Any basis of the fiber of a coherent sheaf at a point extends to a minimal generating set of local sections.

view this post on Zulip John Baez (Aug 30 2023 at 08:46):

or

A finitely generated module over a local ring is projective only if it is free.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 11:44):

John Baez said:

This instantly implies something I wanted to know: if $I$ is an ideal in a commutative ring $R$ with $I^2 = I$, then $I = pR$ for some $p \in R$ with $p^2 = p$.

I don't understand how you get this instantly. And is it even true? I've seen it written on the internet with the additional hypothesis that $I$ is finitely generated.

view this post on Zulip James Deikun (Aug 30 2023 at 11:51):

The ideal does have to be finitely generated to apply Nakayama's Lemma, indeed. But otherwise it's immediate: applying the lemma with $M = I$ gives $p \in I$ with $pm = m$ for all $m \in I$, so $I = pR$ and $p^2 = p$.

view this post on Zulip Todd Trimble (Aug 30 2023 at 12:19):

John Baez said:

Starting from this simple formulation I'm finally getting more interested in the many alternative formulations and consequences, some of which sound more conceptual, like

If $f: M \to M$ is a surjective endomorphism of a finitely generated $R$-module, then $f$ is an isomorphism.

Yes, this is a good one. Nakayama's lemma seems to be like Yoneda's lemma, being one of these simple results that people in the know use all the time and is coincidentally named after a Japanese mathematician.

I'm not particularly in the know, but over a decade ago I wrote at the nLab, "Nakayama’s lemma is a simple but fundamental result of commutative algebra frequently used to lift information from the fiber of a sheaf over a point (as for example a coherent sheaf over a scheme) to give information on the stalk at that point".

I began writing the article after getting some glimmers of what it was really about (in the minds of people who really use the stuff) from this MathOverflow thread, I think especially the answer by Roy Smith. I especially like the application to the algebraic geometry form of the inverse function theorem.

view this post on Zulip John Baez (Aug 30 2023 at 12:34):

James Deikun said:

The ideal does have to be finitely generated to apply Nakayama's Lemma, indeed.

Yes, I meant to say "finitely generated ideal".

Here's a nice counterexample when your ideal is not finitely generated. Let $R$ be the ring of continuous complex-valued functions on $[0,1]$ and let $I$ be the ideal consisting of functions that vanish at $0$. Then $I^2 = I$ but $I$ is not of the form $pR$ for any idempotent $p \in R$ (which would need to be a continuous function that only takes the values $0$ and/or $1$).
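A numerical sketch of why $I^2 = I$ holds here, assuming the factorization $f = gh$ with $g = |f|^{1/2}$ and $h = f/|f|^{1/2}$ (both factors are continuous and vanish at $0$):

```python
# Added illustration of I^2 = I for this ideal: any f with f(0) = 0 factors
# as f = g*h where g = |f|**0.5 and h = f/|f|**0.5 (set h = 0 where f = 0);
# both g and h are continuous and vanish at 0, so f lies in I^2.
import numpy as np

t = np.linspace(0, 1, 1001)
f = t * np.exp(1j * t)                        # a sample element of I: f(0) = 0
g = np.abs(f) ** 0.5
h = np.divide(f, g, out=np.zeros_like(f), where=g != 0)
assert np.allclose(g * h, f)                  # f = g*h on the grid
assert abs(g[0]) < 1e-9 and abs(h[0]) < 1e-9  # both factors vanish at 0
print("f = g*h with g, h in I, illustrating I^2 = I")
```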

view this post on Zulip John Baez (Aug 30 2023 at 12:38):

Todd Trimble said:

Nakayama's lemma seems to be like Yoneda's lemma, being one of these simple results that people in the know use all the time and is coincidentally named after a Japanese mathematician.

I was thinking about that. But what I'd really like to do is derive Nakayama's lemma from the Yoneda lemma! Then I could annoy people by saying "Nakayama's lemma? That's just a corollary of the Yoneda lemma."

view this post on Zulip John Baez (Aug 30 2023 at 12:41):

Now, it may be hopeless to derive the Nakayama lemma from the Yoneda lemma. But the Nakayama lemma does follow from ideas connected to the Cayley-Hamilton theorem, and that theorem has a vaguely Yoneda-ish feel, at least in my fevered brain.

view this post on Zulip John Baez (Aug 30 2023 at 12:42):

Remember, it says that if you take a square matrix $A$ and define its characteristic polynomial $p$ by

$p(\lambda) = \det(\lambda I - A)$

then you get

$p(A) = 0$
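A quick symbolic check of this with sympy, on an arbitrarily chosen $2 \times 2$ matrix:

```python
# Added sanity check of Cayley-Hamilton with sympy on an example matrix.
from sympy import Matrix, symbols, eye, zeros

lam = symbols('lambda')
A = Matrix([[1, 2], [3, 4]])
p = (lam * eye(2) - A).det()           # characteristic polynomial det(lambda*I - A)
coeffs = p.as_poly(lam).all_coeffs()   # [1, -5, -2] for this A

# evaluate p at the matrix A by Horner's rule
P = zeros(2, 2)
for c in coeffs:
    P = P * A + c * eye(2)
assert P == zeros(2, 2)                # p(A) = 0
print("Cayley-Hamilton verified for A =", A.tolist())
```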

view this post on Zulip John Baez (Aug 30 2023 at 12:44):

There's some sort of weird self-reference here, like "$A$ is the walking root of its own characteristic polynomial".

view this post on Zulip John Baez (Aug 30 2023 at 12:45):

And that "walking" business feels vaguely Yoneda-esque (to my fevered brain).

view this post on Zulip John Baez (Aug 30 2023 at 12:45):

Here's a fake one-line proof of the Cayley-Hamilton theorem:

$p(A) = \det(A I - A) = 0$

view this post on Zulip John Baez (Aug 30 2023 at 12:46):

Puzzle for beginners: spot why this proof is fake.

Puzzle for experts: try to morph it into a real proof!

view this post on Zulip Steve Huntsman (Aug 30 2023 at 12:54):

John Baez said:

Puzzle for beginners: spot why this proof is fake.

Puzzle for experts: try to morph it into a real proof!

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 13:37):

I'd be glad to see (or better, find) a proof which uses modern tools, i.e. one that I could understand easily (in my head, the more structure there is and the fewer tricks there are, the easier it is to understand). A modern definition of the determinant goes like this: for an endomorphism $u$ of an $n$-dimensional space $E$, look at the induced endomorphism $\Lambda^n(u)$ of the top exterior power $\Lambda^n(E)$.

The definition of $\det(u)$ is then: the unique scalar such that $\Lambda^n(u) = \det(u).1_{\Lambda^n(E)}$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 13:52):

By definition, the characteristic polynomial $p(u)$ of $u$ is then given by: $p(u)(\lambda) = \det(\lambda.1_E - u)$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 13:56):

Let's verify that $p(u)$ is really a polynomial (map) first.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 13:59):

$\Lambda^{n}(\lambda.1_E-u)$ is the map from $\Lambda^n(E)$ to $\Lambda^n(E)$ which acts like this on the basis:

$(1)\quad e_{1}\wedge \ldots \wedge e_n \mapsto (\lambda.1_E-u)(e_1)\wedge \ldots \wedge (\lambda.1_E-u)(e_n) = (\lambda.e_1-u(e_1)) \wedge \ldots \wedge (\lambda.e_n-u(e_n))$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:09):

The matrix $(a_{k,l})_{1 \le k,l \le n}$ of $u$ is defined by the equations
$(2)\quad u(e_k) = \sum_{1 \le l \le n} a_{k,l}.e_{l} \quad \forall\, 1 \le k \le n$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:11):

Let me substitute $(2)$ into $(1)$. Now I get that $\Lambda^{n}(\lambda.1_E-u)$ is the map from $\Lambda^n(E)$ to $\Lambda^n(E)$ which acts like this on the basis:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:12):

$e_{1}\wedge \ldots \wedge e_n \mapsto (\lambda.e_1-\sum_{1 \le l \le n}a_{1,l}.e_{l}) \wedge \ldots \wedge (\lambda.e_n-\sum_{1 \le l \le n}a_{n,l}.e_{l})$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:15):

i.e. $\Lambda^{n}(\lambda.1_E-u)(e_1 \wedge \ldots \wedge e_n) = (\lambda.e_1-\sum_{1 \le l \le n}a_{1,l}.e_{l}) \wedge \ldots \wedge (\lambda.e_n-\sum_{1 \le l \le n}a_{n,l}.e_{l})$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:29):

$= ((\lambda-a_{1,1}).e_1 - a_{1,2}.e_{2} - \ldots - a_{1,n}.e_n) \wedge \ldots \wedge (-a_{n,1}.e_{1} - \ldots - a_{n,n-1}.e_{n-1} + (\lambda-a_{n,n}).e_n)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:37):

$= \sum_{1 \le i_1,\ldots,i_n \le n} c_{i_1,\ldots,i_n}.e_{i_1}\wedge\ldots\wedge e_{i_n}$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:38):

where $c_{i_1,\ldots,i_n}$ is a uniquely determined scalar.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:52):

$(*)\quad$ Remember that $v_{\sigma(1)} \wedge \ldots \wedge v_{\sigma(n)} = \epsilon(\sigma).v_{1}\wedge\ldots\wedge v_{n}$, which gives in particular that $v_{1} \wedge \ldots \wedge v_{n} = 0$ when there is a pair $1 \le i < j \le n$ such that $v_{i} = v_{j}$.

view this post on Zulip Todd Trimble (Aug 30 2023 at 14:55):

For what it's worth, the Cayley-Hamilton theorem is proved in the nLab here, as a relatively easy consequence of Cramer's rule. By the way, Cayley-Hamilton holds over any commutative ring.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:55):

Yeah, thanks. I'm just trying to do my own proof (I'm doing the puzzle).

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 14:57):

I'd prefer it if there's no use of something like Cramer's rule and if it follows from a direct computation by applying the definitions (that's probably a matter of taste). But I can't guarantee that it is not going to take me the whole day -- if it works.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:02):

So, I continue my computation.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:05):

Let me rewrite slightly differently what I wrote before (I deleted all the terms which are equal to $0$ using $(*)$ and enumerated the other ones using the symmetric group):

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:13):

$\Lambda^{n}(\lambda.1_E-u)(e_1 \wedge \ldots \wedge e_n) = \sum_{\sigma \in \mathfrak{S}_n} c_{\sigma(1),\ldots,\sigma(n)}.e_{\sigma(1)}\wedge \ldots \wedge e_{\sigma(n)}$

view this post on Zulip Todd Trimble (Aug 30 2023 at 15:15):

John Baez said:

Todd Trimble said:

Nakayama's lemma seems to be like Yoneda's lemma, being one of these simple results that people in the know use all the time and is coincidentally named after a Japanese mathematician.

I was thinking about that. But what I'd really like to do is derive Nakayama's lemma from the Yoneda lemma! Then I could annoy people by saying "Nakayama's lemma? That's just a corollary of the Yoneda lemma."

That's a fun motivation: finding a new way to annoy people using the Yoneda lemma. JK

I don't have a lot to say at the moment about a connection between the two, but it does remind me of stuff that Andre Scedrov said near the beginning of his AMS Memoir Forcing and Classifying Topoi. On page 11, he cites one of the standard approaches to proving the Cayley-Hamilton theorem as an example of universal thinking: "one always takes only the bare essentials needed to satisfy the hypothesis". So here, he says, "If ... $P_M(t) = \det(tI - M)$, then showing $P_M(M) = 0$ at once is too hard. Rather, think of $tI - M$ as a matrix over $k[t]$, $E$ as a module over $k[t]$, and show $P_M(t)E = 0$ in this setting."

Where is he going with this? He's making an analogy with the process also described by Lawvere in his Variable Sets, Etendu, and Variable Structures in Topoi: just as one goes from a ring of constants $k$ to $k[t]$ by adjoining a variable, so one can move from "constant sets" to more "variable sets" (e.g., presheaf toposes), and then the forcing process (localizing with respect to forcing conditions) is analogous to taking a quotient modulo an ideal, tailor-made to fit the hypotheses one is after.

So there is the idea of using generic elements. In the Yoneda lemma, one considers an identity morphism as a "generic generalized element".

That's all I got at the moment... wish I had more.

view this post on Zulip Todd Trimble (Aug 30 2023 at 15:17):

Jean-Baptiste Vienney said:

Yeah, thanks. I'm just trying to do my own proof (I'm doing the puzzle).

No, I know, and sorry to interrupt -- I thought you were done for the moment. Actually, I wasn't mainly talking to you, I was talking to everybody. Carry on...

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:18):

Todd Trimble said:

Jean-Baptiste Vienney said:

Yeah, thanks. I'm just trying to do my own proof (I'm doing the puzzle).

No, I know, and sorry to interrupt -- I thought you were done for the moment. Actually, I wasn't mainly talking to you, I was talking to everybody. Carry on...

Ok :sweat_smile: . I wasn't sure whether you were annoyed by my attempt or not. I'm taking breaks.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:19):

Actually, I've never had a good understanding of this theorem, and I didn't know exterior powers or even the tensor product when I learned it, so I'm trying to see what I can do today using these tools (by the way, I haven't used the hypothesis that the scalars are in a field, except maybe that $r+r \neq 0$ if $r \neq 0$, not sure)

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:24):

I use $(*)$ again to get:
$\Lambda^{n}(\lambda.1_E-u)(e_1 \wedge \ldots \wedge e_n) = \sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma).c_{\sigma(1),\ldots,\sigma(n)}.e_{1}\wedge \ldots \wedge e_{n}$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:26):

So, we have:
$(3) \quad p(u)(\lambda) = \sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma).c_{\sigma(1),\ldots,\sigma(n)}$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:30):

The $c_{i_{1},\ldots,i_{n}}$ are polynomials in $\lambda$ (of degree $\le n$), which depend on the matrix of $u$, although I don't want to write down the expression explicitly...

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:32):

So $p(u)(\lambda)$ is really a polynomial (of degree $n$) in $\lambda$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:33):

Now, we want to prove that $p(u)(u) = 0$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:43):

Well, I can't make any further progress if I don't work on the coefficients $c_{\sigma(1),\ldots,\sigma(n)}$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 15:54):

I can decompose
$(4) \quad c_{\sigma(1),\ldots,\sigma(n)} = d^{\sigma}_1 \cdots d^{\sigma}_n$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:04):

Jean-Baptiste Vienney said:

$= ((\lambda-a_{1,1}).e_1 - a_{1,2}.e_{2} - \ldots - a_{1,n}.e_n) \wedge \ldots \wedge (-a_{n,1}.e_{1} - \ldots - a_{n,n-1}.e_{n-1} + (\lambda-a_{n,n}).e_n)$

I'm gonna try to compute some coefficients here to get an idea

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:06):

The coefficient of $e_{1}\wedge\ldots\wedge e_n$ is $(\lambda-a_{1,1})(\lambda-a_{2,2})\cdots(\lambda-a_{n,n})$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:08):

The coefficient of $e_{1} \wedge e_{3} \wedge e_{2}$ (if we had $n=3$) would be $(\lambda-a_{1,1})(-a_{2,3})(-a_{3,2})$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:10):

The coefficient of $e_{3}\wedge e_{2} \wedge e_{1}$ would be $(-a_{1,3})(\lambda-a_{2,2})(-a_{3,1})$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:10):

So, there is a pattern

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:12):

The coefficient of $e_{\sigma(1)} \wedge \ldots \wedge e_{\sigma(n)}$ is my $c_{\sigma(1),\ldots,\sigma(n)} = d_{1}^{\sigma}\cdots d_{n}^{\sigma}$ where for every $1 \le i \le n$:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:14):

$(5a)\quad d_{i}^{\sigma} = -a_{i,\sigma(i)}$ if $i \neq \sigma(i)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:14):

$(5b)\quad d_{i}^{\sigma} = \lambda - a_{i,\sigma(i)}$ if $i = \sigma(i)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:31):

Let's combine $(3)$, $(4)$, $(5)$:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:35):

$(6) \quad p(u)(\lambda) = \sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}) \prod_{1 \le i \le n,\ i = \sigma(i)}(\lambda - a_{i,\sigma(i)})$
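Since formula $(6)$ is the Leibniz expansion of $\det(\lambda I - A)$, it can be sanity-checked mechanically; a short sympy/itertools comparison on an arbitrarily chosen $3 \times 3$ matrix:

```python
# Added check that formula (6) agrees with det(lambda*I - A) on an example.
from itertools import permutations
from sympy import Matrix, symbols, eye, expand
from sympy.combinatorics import Permutation

lam = symbols('lambda')
n = 3
A = Matrix(3, 3, [1, 2, 0, 0, 1, 3, 4, 0, 1])   # arbitrary test matrix

p6 = 0
for perm in permutations(range(n)):
    sign = Permutation(perm).signature()        # epsilon(sigma)
    term = sign
    for i in range(n):
        # (lambda - a_{i,i}) on fixed points, -a_{i,sigma(i)} elsewhere
        term *= (lam - A[i, perm[i]]) if perm[i] == i else (-A[i, perm[i]])
    p6 += term

assert expand(p6 - (lam * eye(n) - A).det()) == 0
print("formula (6) agrees with det(lambda*I - A)")
```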

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:36):

Now, it's very clear that $p(u)(\lambda)$ is a polynomial!

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 16:39):

We thus want to prove that:
$(7)\quad \sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}) \prod_{1 \le i \le n,\ i = \sigma(i)}(u - a_{i,\sigma(i)}.1_E) = 0$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:09):

Be aware that the second product must be interpreted as a composite...

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:14):

Let's write $(e_{1}^*,\ldots,e_{n}^*)$ for the dual basis of $(e_1,\ldots,e_n)$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:31):

I'm going to use the natural isomorphism $\mathrm{End}(E) \cong E \otimes E^*$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:33):

which sends an endomorphism $v$ to $\sum_{1 \le k,l \le n} e_l^*(v(e_k)).e_{l}\otimes e_k^*$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:39):

(Note that $e_{l}^*(v(e_k))$ is just the $(k,l)$ coefficient of the matrix of $v$.)

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:39):

In fact it is also an isomorphism of $k$-algebras.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:44):

The product in $E \otimes E^*$ is given by $(v \otimes \phi) \otimes (w \otimes \psi) \mapsto \phi(w).v \otimes \psi$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:46):

(The idea is that $\psi$ absorbs the initial entry, $w$ is an intermediate output, $\phi$ absorbs this intermediate output, and then $v$ is a final output, that's why you get the coefficient $\phi(w)$)
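A concrete numpy check of this product law, assuming the convention that an endomorphism $u$ corresponds to the coefficient array $U[l,k] = e_l^*(u(e_k))$; summing the rank-one rule over all pieces should reproduce composition, i.e. the matrix product:

```python
# Added check that the product on E (x) E* reproduces composition of
# endomorphisms, using (v (x) phi)(w (x) psi) = phi(w) . v (x) psi.
import numpy as np

n = 3
rng = np.random.default_rng(0)
U = rng.integers(-5, 5, (n, n))   # U[l, k]: coefficient of e_l (x) e_k* for u
V = rng.integers(-5, 5, (n, n))   # same convention for a second endomorphism

prod = np.zeros((n, n), dtype=int)
for l in range(n):
    for k in range(n):
        for p in range(n):
            for q in range(n):
                # v = e_l, phi = e_k*, w = e_p, psi = e_q*; e_k*(e_p) = delta_{kp}
                phi_of_w = 1 if k == p else 0
                prod[l, q] += U[l, k] * V[p, q] * phi_of_w

assert np.array_equal(prod, U @ V)   # agrees with composition (matrix product)
print("E (x) E* product reproduces composition of endomorphisms")
```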

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:50):

I'm going to rewrite the LHS of $(7)$ as the $n$-ary product applied to an element of $(E \otimes E^*)^{\otimes n}$, i.e. I will exhibit an element $y \in (E \otimes E^*)^{\otimes n}$ such that, if we consider the $n$-ary product $\nabla^n : (E \otimes E^*)^{\otimes n} \rightarrow E \otimes E^*$, then the LHS of $(7)$ is equal to $\nabla^n(y)$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:54):

I do that because it feels too hard to compute the composite/product directly

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 17:57):

but it will still replace $u$ by coefficients of its matrix plus basis and dual basis elements in this LHS, so it should be better

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:03):

Yeah, @James Deikun help me!! (or if you have something else in the same vein)

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:04):

I changed the product because in $(7)$ you compose with $\circ$

view this post on Zulip James Deikun (Aug 30 2023 at 18:06):

Let's see:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:10):

Under my isomorphism of $k$-algebras, the fact $(7)$ that I want to prove is now equivalent to proving that this vector is equal to $0$:
$(8)$
$\nabla^n\bigg(\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}) \bigotimes_{1 \le i \le n,\ i = \sigma(i)}\Big(\sum_{1 \le k,l \le n} e_l^*(u(e_k)).e_{l}\otimes e_k^* - a_{i,\sigma(i)}.\sum_{1 \le m \le n} e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:11):

but recall that the $e_{l}^*(u(e_k)) = a_{k,l}$ are just the coefficients of the matrix of $u$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:14):

so this vector is equal to:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:14):

$(9)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:14):

$\nabla^n\bigg(\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}) \bigotimes_{1 \le i \le n,\ i = \sigma(i)}\Big(\sum_{1 \le k,l \le n} a_{k,l}.e_{l}\otimes e_k^* - a_{i,\sigma(i)}.\sum_{1 \le m \le n} e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:18):

Because everything is linear, I can move all the coefficients out of the product $\nabla^n$. Let's do it in several steps.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:19):

First, $(9)$ is equal to:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:20):

$(10)$
$\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}).\nabla^n\bigg(\bigotimes_{1 \le i \le n,\ i = \sigma(i)}\Big(\sum_{1 \le k,l \le n} a_{k,l}.e_{l}\otimes e_k^* - a_{i,\sigma(i)}.\sum_{1 \le m \le n} e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:27):

I'm going to put all the coefficients before vectors of the type $e_{m}\otimes e_{m}^*$ together in the above expression. I obtain:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:31):

$\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}).\nabla^n\bigg(\bigotimes_{1 \le i \le n,\ i = \sigma(i)}\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^* + \sum_{1 \le m \le n}(a_{m,m}-a_{i,\sigma(i)}).e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:32):

I just do a tiny simplification by replacing $\sigma(i)$ by $i$ when they are equal. I obtain:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:32):

$\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}).\nabla^n\bigg(\bigotimes_{1 \le i \le n,\ i = \sigma(i)}\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^* + \sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:38):

Clearly, the thing to do now is to expand the big tensor product inside the $\nabla^n$.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:48):

I want to apply a distributivity like this (where $I$ is a finite set):
$\bigotimes_{i \in I}(y_i^1 + y_i^2) = \sum_{\alpha : I \rightarrow \{1,2\}}\ \bigotimes_{i \in I}\ y_{i}^{\alpha(i)}$
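This multilinear expansion is easy to verify mechanically; a small numpy sketch using `kron` as the tensor product, with $|I| = 3$ and random vectors chosen just for the check:

```python
# Added check of the distributivity law
#   (x)_{i in I} (y_i^1 + y_i^2) = sum over alpha: I -> {1,2} of (x)_i y_i^{alpha(i)}
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
y1 = [rng.standard_normal(2) for _ in range(3)]   # the y_i^1
y2 = [rng.standard_normal(2) for _ in range(3)]   # the y_i^2

def tensor(vectors):
    out = np.array([1.0])
    for v in vectors:
        out = np.kron(out, v)                     # iterated tensor product
    return out

lhs = tensor([a + b for a, b in zip(y1, y2)])
rhs = sum(tensor([y1[i] if choice[i] == 1 else y2[i] for i in range(3)])
          for choice in product((1, 2), repeat=3))
assert np.allclose(lhs, rhs)
print("distributivity of tensor over sums verified for |I| = 3")
```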

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:50):

Here $I$ is $\{1 \le i \le n,\ i = \sigma(i)\}$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:53):

$y_{i}^{1}$ is $\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:54):

$y_i^2$ is $\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 18:56):

I want to find, given a function $\alpha : I \rightarrow \{1,2\}$, how I can write $\bigotimes_{i \in I}\ y_{i}^{\alpha(i)}$ in a good way (with $I$, $y_{i}^{1}$, $y_{i}^{2}$ as above)

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:01):

$y_{i}^{1} =: y^1$ doesn't depend on $i$ here

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:02):

so that $\bigotimes_{i \in I}\ y_{i}^{\alpha(i)} = (y^1)^{\otimes |\alpha^{-1}(1)|} \otimes \bigotimes_{i \in \alpha^{-1}(2)}\ y_{i}^{2}$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:04):

i.e. by replacing $y^1$ and $y_i^2$ by their respective expressions:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:05):

$\bigotimes_{i \in I}\ y_{i}^{\alpha(i)} = \Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*\Big)^{\otimes |\alpha^{-1}(1)|} \otimes \bigotimes_{i \in \alpha^{-1}(2)}\Big(\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:06):

So that I get
$\bigotimes_{i \in I}(y_i^1 + y_i^2) = \sum_{\alpha : I \rightarrow \{1,2\}}\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*\Big)^{\otimes |\alpha^{-1}(1)|} \otimes \bigotimes_{i \in \alpha^{-1}(2)}\Big(\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:08):

I just replace the $I$ by its expression:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:08):

$\bigotimes_{i \in I}(y_i^1 + y_i^2) = \sum_{\alpha : \{1 \le i \le n,\ i = \sigma(i)\} \rightarrow \{1,2\}}\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*\Big)^{\otimes |\alpha^{-1}(1)|} \otimes \bigotimes_{i \in \alpha^{-1}(2)}\Big(\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:09):

And I put this in $(10)$ to get:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:09):

$(11)$
$\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)}).\nabla^n\bigg(\sum_{\alpha : \{1 \le i \le n,\ i = \sigma(i)\} \rightarrow \{1,2\}}\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*\Big)^{\otimes |\alpha^{-1}(1)|} \otimes \bigotimes_{i \in \alpha^{-1}(2)}\Big(\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:09):

Ouch

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:11):

I recall that the Cayley-Hamilton theorem is equivalent to: for every $u : E \rightarrow E$, if $(e_{1},\ldots,e_{n})$ is a basis of $E$, and $(a_{k,l})_{1 \le k,l \le n}$ the matrix of $u$, then $(11)$ is equal to $0$...

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:15):

I can move the first sum out of $\nabla^n$ to get:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:16):

$(12)$
$\sum_{\sigma \in \mathfrak{S}_n}\epsilon(\sigma)\prod_{1 \le i \le n,\ i \neq \sigma(i)}(-a_{i,\sigma(i)})\sum_{\alpha : \{1 \le i \le n,\ i = \sigma(i)\} \rightarrow \{1,2\}}\ \nabla^n\bigg(\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*\Big)^{\otimes |\alpha^{-1}(1)|} \otimes \bigotimes_{i \in \alpha^{-1}(2)}\Big(\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)\bigg)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:18):

I still must use distributivities to move stuff outside of $\nabla^n$ and finally be able to perform the multiplication

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:30):

I must distribute here:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:30):

$\Big(\sum_{1 \le k \neq l \le n} a_{k,l}.e_{l}\otimes e_k^*\Big)^{\otimes |\alpha^{-1}(1)|}$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:31):

and here:

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:31):

$\bigotimes_{i \in \alpha^{-1}(2)}\Big(\sum_{1 \le m \le n}(a_{m,m}-a_{i,i}).e_{m}\otimes e_{m}^*\Big)$

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:37):

In the first expression, I must use the multinomial theorem...

view this post on Zulip Simon Burton (Aug 30 2023 at 19:40):

image.png
This is the reference @Todd Trimble gave above. So now I'm thinking of $k[t]$ as "the walking endomorphism" ring...

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:44):

Sorry, you can discuss. I still hope that my infinite calculation can terminate (if ever).

view this post on Zulip Todd Trimble (Aug 30 2023 at 19:45):

Jean-Baptiste Vienney said:

I'd prefer it if there's no use of something like Cramer's rule and if it follows from a direct computation by applying the definitions (that's probably a matter of taste). But I can't guarantee that it is not going to take me the whole day -- if it works.

With regard to Cramer's rule: it's an easy consequence of the fact that the determinant is an alternating multilinear map $V^n = V \times \ldots \times V \to k$. In other words, a direct computation by applying the definition of determinant.

But the only reason for mentioning it is that it leads directly to the adjugate matrix of a matrix $A$, viz. a matrix $\widetilde{A}$ such that $A \widetilde{A} = \widetilde{A} A = \det(A) \cdot I$. The presence of such a matrix $\widetilde{A}$ is the key thing used in the nLab to prove Cayley-Hamilton, which has a nice short conceptual proof (essentially the same proof as in Lang's Algebra, if I remember correctly).

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:51):

That looks more intelligent. I'm still intrigued to know whether my brute-force calculation can finish.

view this post on Zulip Todd Trimble (Aug 30 2023 at 19:55):

I'm all for experimentation, but I think it would be better to work these things out by yourself and not go through a long public calculation. I think the main problem may be that not many people here would be willing to read through all of it, and besides, people were beginning to have a conversation about Nakayama's lemma.

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:56):

Mhh I agree

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 19:57):

I'll try to finish this outside of this conversation

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 20:05):

Sorry for the spam with gigantic tensors

view this post on Zulip Patrick Nicodemus (Aug 30 2023 at 20:12):

Paul Garrett has a proof of Cayley-Hamilton at the end of his algebra book which I find pretty interesting.

view this post on Zulip James Deikun (Aug 30 2023 at 20:14):

Is there any good proof that doesn't make use of adjugates?

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 20:17):

Patrick Nicodemus said:

Paul Garrett has a proof of Cayley-Hamilton at the end of his algebra book which I find pretty interesting.

I think I would like it, because he uses exterior powers and the adjugate, but with an endomorphism and not a matrix, and also modules over $k[x]$ as mentioned above

view this post on Zulip Patrick Nicodemus (Aug 30 2023 at 20:20):

James Deikun said:

Is there any good proof that doesn't make use of adjugates?

Adjugates can be motivated pretty well from a more abstract algebraic point of view; they don't have to be defined purely in terms of some formula that acts on a matrix. I can elaborate on this

view this post on Zulip Jean-Baptiste Vienney (Aug 30 2023 at 20:21):

Can you say more?

view this post on Zulip Todd Trimble (Aug 30 2023 at 20:26):

James Deikun said:

Is there any good proof that doesn't make use of adjugates?

There's a kind of "dirty" proof where one sees that diagonal matrices $D$ solve the characteristic polynomial equation (since $p(D)$ for a polynomial $p$ is a diagonal matrix whose diagonal entries are $p(d_i)$, where $d_i$ are the entries for $D$). Since the determinant is invariant under conjugation, we can then extend Cayley-Hamilton to the set of all diagonalizable matrices. Then, since diagonalizable matrices are dense in the set of all matrices, and since the set of matrices that satisfy CH is a closed set (even in the Zariski topology), Cayley-Hamilton holds for all matrices.

One has to think a bit carefully, though, about how well this holds up for all commutative rings. I'm not sure that density assertion does.
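The diagonal step of this sketch can be checked symbolically; a small sympy verification that $p(D) = 0$ for a generic diagonal $D$ (the $3 \times 3$ size is arbitrary):

```python
# Added check of the first step of the "dirty" proof: for diagonal D,
# p(D) is diagonal with entries p(d_i), and the d_i are exactly the roots
# of the characteristic polynomial, so p(D) = 0.
from sympy import Matrix, diag, symbols, zeros, expand

lam, d1, d2, d3 = symbols('lambda d1 d2 d3')
D = diag(d1, d2, d3)
p = (lam * Matrix.eye(3) - D).det()      # (lambda-d1)(lambda-d2)(lambda-d3)

coeffs = p.as_poly(lam).all_coeffs()
P = zeros(3, 3)
for c in coeffs:                         # Horner evaluation of p at D
    P = P * D + c * Matrix.eye(3)
assert P.applyfunc(expand) == zeros(3, 3)
print("p(D) = 0 for a generic diagonal D")
```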

view this post on Zulip Todd Trimble (Aug 30 2023 at 20:53):

Ack, diagonalizable matrices are not dense of course. I guess I had in mind the classical case where the commutative ring is a field, and apply the idea that the statement of CH is insensitive to whether we stick with that field or pass to an extension field, and use this to pass to an algebraic closure and argue there. So, yeah, it's looking dirtier by the moment.

view this post on Zulip Patrick Nicodemus (Aug 30 2023 at 21:50):

I consider free modules of finite rank over a given ring $R$.

Fix a rank 1 free module $A$ (not necessarily with a given choice of isomorphism with $R$) and let $(-)^\ast$ abbreviate the functor $\mathrm{Hom}(-, A)$.
A perfect pairing $V \otimes W \to A$ is a map such that both ways of currying give isomorphisms, so $V \cong W^\ast$ and $W \cong V^\ast$.
For any perfect pairing $\sigma : V \cong W^\ast$, and for $T$ an endomorphism of $V$, construct a map

$(\sigma T \sigma^{-1})^{\ast} : W^{\ast\ast} \to W^{\ast\ast}$
and by virtue of the natural isomorphism $W \cong W^{\ast\ast}$ I can regard this as a map $T^\sharp : W \to W$.

This map is called the adjoint of $T$ and is uniquely defined by $\langle Tv, w\rangle = \langle v, T^\sharp w\rangle$. Similarly, if $F$ is an endomorphism of $W$, its adjoint is an endomorphism of $V$.

Now take $V$ to be a free module of rank $n$, let $W = \Lambda^{n-1}(V)$
and take $A$ to be $\Lambda^{n}(V)$. The wedge product is a perfect pairing.

If $T$ is any endomorphism of $V$ then, because the exterior algebra is a functor, $F = \Lambda^{n-1}(T)$ is an endomorphism of $W = \Lambda^{n-1}(V)$. So it has an adjoint $F^\sharp$ which is an endomorphism of $V$. I call this endomorphism the adjugate of $T$.

Its defining property is that it satisfies
$F^\sharp(v) \wedge v_2 \wedge \dots \wedge v_n = v \wedge T(v_2) \wedge \dots \wedge T(v_n)$.

For any basis of $V$ there are standard bases for $W$ and $A$. Moreover, as soon as a basis is fixed, $V$ is isomorphic to its dual space, and so are $W$ and $A$. I can describe the computation of the matrices arising from these, but it will be a bit tedious; I will leave it as an exercise
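In the $2 \times 2$ case the defining property can be checked directly in numpy, identifying $\Lambda^2(\mathbb{R}^2)$ with $\mathbb{R}$ via $x \wedge y = x_0 y_1 - x_1 y_0$ (the matrix here is an arbitrary example):

```python
# Added 2x2 sanity check of the adjugate's defining property,
#   adj(T)(v) ^ w = v ^ T(w),
# where x ^ y = x0*y1 - x1*y0 identifies Lambda^2(R^2) with R.
import numpy as np

def wedge(x, y):
    return x[0] * y[1] - x[1] * y[0]

T = np.array([[1, 2], [3, 4]])
adjT = np.array([[4, -2], [-3, 1]])   # classical adjugate of T

rng = np.random.default_rng(2)
for _ in range(5):
    v, w = rng.integers(-9, 9, 2), rng.integers(-9, 9, 2)
    assert wedge(adjT @ v, w) == wedge(v, T @ w)
print("adj(T)(v) ^ w = v ^ T(w) holds for random v, w")
```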

view this post on Zulip Patrick Nicodemus (Aug 30 2023 at 21:55):

it's helpful here to know that if $T$ is an endomorphism of $V$ then its determinant can be defined as the unique scalar $r$ in $R$ such that $\Lambda^n(T)$ is equal to $r \cdot -$ as endomorphisms of $\Lambda^n(V)$.

view this post on Zulip Patrick Nicodemus (Aug 30 2023 at 21:55):

that's why the adjugate is closely connected to the determinant.

view this post on Zulip James Deikun (Aug 30 2023 at 21:58):

From this perspective is there a way to make the adjugate still work for modules that are not free? Apparently the Cayley-Hamilton theorem itself still works in this situation, just as long as the ring is commutative.

view this post on Zulip James Deikun (Aug 30 2023 at 21:59):

(This is much nicer than picking a basis and manipulating a bunch of formulas though.)

view this post on Zulip Patrick Nicodemus (Aug 30 2023 at 22:04):

I don't know of a reformulation of the proof where you get it for free, but it's a corollary. If $M$ is f.g., take a surjection $\phi : R^n \to M$; then $T : M \to M$ lifts to $T' : R^n \to R^n$ such that the square commutes. Then you apply the theorem to $T'$ and use this to get it for $T$.

view this post on Zulip John Baez (Aug 30 2023 at 22:29):

Todd Trimble said:

There's a kind of "dirty" proof where one sees that diagonal matrices $D$ solve the characteristic polynomial equation (since $p(D)$ for a polynomial $p$ is a diagonal matrix whose diagonal entries are $p(d_i)$, where $d_i$ are the entries for $D$). Since the determinant is invariant under conjugation, we can then extend Cayley-Hamilton to the set of all diagonalizable matrices. Then, since diagonalizable matrices are dense in the set of all matrices, and since the set of matrices that satisfy CH is a closed set (even in the Zariski topology), Cayley-Hamilton holds for all matrices.

I kind of like dirty arguments where you prove something on a dense set and then use continuity to handle the rest. Maybe it's my background in analysis: these arguments come very naturally to me. For example, that's one of my favorite proofs of

$\exp(\mathrm{tr}(A)) = \det(\exp(A))$

But you know this better than anyone, Todd - in our recent work I've been reaching for Zariski density arguments like a drunk reaches for his bottle. :upside_down:
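A one-matrix numerical check of this identity, using scipy's matrix exponential on a random $4 \times 4$ matrix:

```python
# Added numerical check of exp(tr(A)) = det(exp(A)).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
assert np.isclose(np.exp(np.trace(A)), np.linalg.det(expm(A)))
print("exp(tr(A)) == det(exp(A)) to numerical precision")
```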

view this post on Zulip John Baez (Aug 30 2023 at 22:33):

I do however agree that there's something unsatisfying about proving an equation between algebraic functions by proving it at just enough points to let a density argument do the rest: in doing so one may be missing the deeper insights required to prove it more directly.

view this post on Zulip David Michael Roberts (Aug 31 2023 at 02:06):

Todd Trimble said:

Ack, diagonalizable matrices are not dense of course. I guess I had in mind the classical case where the commutative ring is a field, and apply the idea that the statement of CH is insensitive to whether we stick with that field or pass to an extension field, and use this to pass to an algebraic closure and argue there. So, yeah, it's looking dirtier by the moment.

I'm wondering if there is an analogue of Smith normal form/Jordan canonical form for, say, PID entries (not like SNF, where you can pick different bases on either side, though), and then prove Cayley–Hamilton for that special form.

view this post on Zulip David Michael Roberts (Aug 31 2023 at 02:07):

Or even just the pedestrian step of reducing the case of a block diagonal matrix to the blocks separately, and then thinking about what happens with a single Jordan block, say (or the analogue not over a field).

view this post on Zulip David Michael Roberts (Aug 31 2023 at 02:11):

Obviously when working over a more general commutative ring you almost surely have problems in finding invariant submodules and more options for what submodules can be. Checking now, I see that "hereditary" rings are those where submodules of free modules are projective, so things can get pretty weird, and this is probably not a path all the way to a general proof.

view this post on Zulip James Deikun (Aug 31 2023 at 06:27):

In general I don't find any of the proofs of Cayley-Hamilton very conceptual, and none of them seem to prove it in the full generality where it holds. In particular, any proof based on the adjugate doesn't seem like it would work for the case of quaternionic or split-quaternionic matrices. (How do you actually define the determinant of quaternionic matrices anyway, BTW?)

view this post on Zulip John Baez (Aug 31 2023 at 07:58):

Does Cayley-Hamilton even hold for quaternionic matrices?

I don't think there's a quaternion-valued determinant of quaternionic matrices obeying det(gh) = det(g) det(h). (It would take more conditions to turn this thought into a theorem.)

The good determinants for quaternionic matrices are the Dieudonne determinant and the Study determinant, which works out to be the square of the Dieudonne determinant. I wrote a bunch about these for the nLab.

view this post on Zulip John Baez (Aug 31 2023 at 07:59):

The Dieudonne determinant is an idea that works for matrices valued in any division ring.

view this post on Zulip James Deikun (Aug 31 2023 at 11:42):

According to https://www.sciencedirect.com/science/article/pii/0024379595005439 it's a little more complicated than the paradigm case; the "characteristic polynomial" here detects right eigenvalues ($Ax = x\lambda$) and is obtained by intervening in the process of obtaining the Study determinant, doing the $I\lambda - A$ manipulation just before taking the complex determinant. It is a polynomial of degree $2n$ with real coefficients, and its complex roots come in conjugate pairs, even the real ones, corresponding to $n$ "conjugate 3-spheres" of right eigenvalues, counting multiplicity. As a real polynomial, it is pretty clear how to evaluate it even on a quaternionic matrix, and the evaluation on $A$ turns out to be $0$.
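A numerical sketch of that construction, with the representation assumed for illustration: writing $A = Z + Wj$ and passing to the complex adjoint matrix $\chi(A)$, the characteristic polynomial of $\chi(A)$ should have (numerically) real coefficients, and it kills $\chi(A)$, hence $A$, since $\chi$ is an injective $\mathbb{R}$-algebra map and the coefficients are real:

```python
# Added sketch: represent a quaternionic matrix A = Z + W j by its complex
# adjoint matrix chi(A) = [[Z, W], [-conj(W), conj(Z)]]; the characteristic
# polynomial of chi(A) has real coefficients (eigenvalues pair up in
# conjugates), and Horner evaluation at chi(A) gives 0.
import numpy as np

rng = np.random.default_rng(4)
n = 2
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
chiA = np.block([[Z, W], [-W.conj(), Z.conj()]])   # 2n x 2n complex image of A

coeffs = np.poly(chiA)                 # characteristic polynomial of chi(A)
assert np.allclose(np.imag(coeffs), 0) # real coefficients, degree 2n

P = np.zeros((2 * n, 2 * n), dtype=complex)
for c in np.real(coeffs):              # Horner evaluation at chi(A)
    P = P @ chiA + c * np.eye(2 * n)
assert np.allclose(P, 0)
print("degree-2n real characteristic polynomial kills chi(A)")
```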

view this post on Zulip Patrick Nicodemus (Aug 31 2023 at 11:47):

@Jean-Baptiste Vienney pinging you as you requested this longer explanation.

view this post on Zulip James Deikun (Aug 31 2023 at 12:49):

Garibaldi in https://arxiv.org/pdf/math/0203276.pdf constructs the characteristic polynomial as basically the "stable minimal polynomial", encoding the linear dependencies among powers of $A$ that are not "coincidental". In this setting Cayley-Hamilton is basically trivial; the interesting thing is proving that the characteristic polynomial detects eigenvalues and obeys the conventional formula.

view this post on Zulip Aaron David Fairbanks (Sep 01 2023 at 03:12):

Here are string diagrams based on @Patrick Nicodemus's description of adjugates, in case this helps anyone else too.
adj1.png
adj2.png
adj3.png

view this post on Zulip Jean-Baptiste Vienney (Sep 01 2023 at 03:43):

Lovely! We can easily draw the identity defining the characteristic polynomial as an equality between 2 string diagrams too. $p(\lambda) = \det(\lambda.1_E - u)$, so $p(\lambda)$ is the unique scalar such that:
$\Lambda^n(\lambda.1_E - u) = p(\lambda).1_{\Lambda^n(E)}$

view this post on Zulip Jean-Baptiste Vienney (Sep 01 2023 at 03:44):

Now, I'm wondering how you can draw the identity which is named "Cayley-Hamilton theorem"

view this post on Zulip Jean-Baptiste Vienney (Sep 01 2023 at 03:45):

And if we can somehow draw the proof with equalities of string diagrams

view this post on Zulip James Deikun (Sep 02 2023 at 04:16):

From the above-referenced Garibaldi:

Let $a_1, \ldots, a_m$ be an $F$-basis for $A$. Let $R = F[t_1, \ldots, t_m]$ for $t_1, \ldots, t_m$ (commuting) indeterminates, and let $K$ be the quotient field of $R$. We call $\gamma = \sum_i t_i a_i \in A \otimes_F K$ a generic element. The $K$-span of $1, \gamma, \gamma^2, \ldots$ is a subspace of $A \otimes K$, so it must be finite-dimensional over $K$. Hence there is a nonzero monic polynomial $\mathrm{min.poly.}_{\gamma/K}$ in $K[x]$ of smallest degree such that $\mathrm{min.poly.}_{\gamma/K}(\gamma) = 0$, called the minimal polynomial for $\gamma$ over $K$.

I like his general approach but I would really like it better if things like this were manifestly basis-independent ... I don't really know how to make a generic element of $A$ in a basis-independent way, when $A$ is a general associative algebra, though, much less one that will actually work in this kind of construction ...
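For $A = M_2(F)$ the generic element is just a $2 \times 2$ matrix of indeterminates; a sympy check (basis of elementary matrices assumed) that it satisfies its degree-2 characteristic polynomial, which is then its minimal polynomial over $K$:

```python
# Added illustration of Garibaldi's generic element for A = M_2(F):
# gamma satisfies gamma^2 - tr(gamma)*gamma + det(gamma)*I = 0, and since
# 1, gamma are K-linearly independent, this is the minimal polynomial.
from sympy import Matrix, symbols, eye, zeros, expand

t1, t2, t3, t4 = symbols('t1 t2 t3 t4')
gamma = Matrix([[t1, t2], [t3, t4]])   # generic element over K = F(t1,...,t4)

trace, det = t1 + t4, t1 * t4 - t2 * t3
residue = gamma**2 - trace * gamma + det * eye(2)
assert residue.applyfunc(expand) == zeros(2, 2)
print("the generic 2x2 element satisfies its characteristic polynomial")
```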

view this post on Zulip John Baez (Sep 02 2023 at 09:47):

What's $A$? A free $F$-module? Is $F$ a field here?

view this post on Zulip James Deikun (Sep 02 2023 at 11:26):

$A$ is an $F$-algebra, and $F$ is a field; he says most of this also works for $F$ a commutative ring, though quite a few of his proofs would need to be revised.

view this post on Zulip James Deikun (Sep 02 2023 at 11:44):

I guess the interesting fact that should prove the existence of the characteristic polynomial is that the power maps $(-)^k$ on $A$ generate a finite-dimensional subspace of $A[A]$ (the module of endomorphisms of $A$ as an affine scheme) as a module over the coordinate ring $F[A]$ of $A$ when $A$ is finitely generated. But I'm not sure how to prove that; proving it by pigeonholing as above only works for individual points, and then you have to do a bunch of other proofs to show the points act coherently together.

view this post on Zulip Patrick Nicodemus (Sep 02 2023 at 15:25):

I think that searching for basis-free proofs is great, but I also think that at some point you will run up against the wall that if a theorem only holds for finite-dimensional vector spaces or finitely generated modules, you will inevitably run into a point in the proof that invokes that hypothesis in some way or another. Maybe you're looking for some kind of HoTT-style proof where you assume the propositional truncation of "there exists a basis of length $n$"; I have no idea what the theory of such "finite-dimensional" vector spaces is.

view this post on Zulip James Deikun (Sep 02 2023 at 16:55):

Hm, actually I guess $A[A]$ is finitely generated over $F[A]$ when $A$ is finitely generated over $F$ ... so the same sort of counting argument does work after all. Proving that the coefficients are actually polynomials and not rational functions seems more difficult though ... makes me wonder if it's even true for general algebras $A$ when, e.g., $F$ is a non-noetherian commutative ring ...

view this post on Zulip Jean-Baptiste Vienney (Sep 02 2023 at 18:38):

Patrick Nicodemus said:

I think that searching for basis-free proofs is great, but I also think that at some point you will run up against the wall that if a theorem only holds for finite-dimensional vector spaces or finitely generated modules, you will inevitably run into a point in the proof that invokes that hypothesis in some way or another. Maybe you're looking for some kind of HoTT-style proof where you assume the propositional truncation of "there exists a basis of length $n$"; I have no idea what the theory of such "finite-dimensional" vector spaces is.

Maybe such a thing will be possible once we have a categorical characterization of the category of finite-dimensional vector spaces. André Kornell has found Axioms for the category of Hilbert spaces together with Chris Heunen, and then Axioms for the category of sets and relations, which share some similarities with the category of finite-dimensional vector spaces, because they are both compact-closed categories enriched over commutative monoids. He gave a talk recently at uOttawa and said that axioms for $FHilb$ (which is a category equivalent to $FVec$) look more difficult to find than for $Hilb$, which seems paradoxical. I wanted to tell him that I find it logical, because the definition of a finite-dimensional vector space is longer than the definition of a generic vector space, and therefore a finite-dimensional vector space is something richer than a generic vector space, even if, for instance, a non-degenerate *-autonomous category of vector spaces, i.e. a good category of infinite-dimensional vector spaces, is a category even richer and more complicated than $FHilb$. I didn't dare tell him this because I was too shy, so I say it here.

Now, would axioms for $FVec$ allow us to reason without bases? I guess yes, but I'm not sure of that.

view this post on Zulip Jean-Baptiste Vienney (Sep 02 2023 at 19:03):

Looking at the axioms for Hilbert spaces, I don't think that such axioms for $FVec$ would avoid speaking about bases, because possessing "dagger biproducts" is one of these axioms, so it looks like biproducts would play a role. And biproducts are a way to talk about direct sums and thus also about bases.

view this post on Zulip Jean-Baptiste Vienney (Sep 02 2023 at 19:04):

Anyway, I can't answer the question before these axioms are found.

view this post on Zulip Patrick Nicodemus (Sep 02 2023 at 21:57):

I like the characterization of finite-dimensional vector spaces as precisely the dualizable ones. Mike Shulman and Kate Ponto have a good paper on this:
https://arxiv.org/abs/1107.6032
Although tbh my understanding stops at thinking of these as enabling you to do "feedback loop" or "fixed point" arguments; I don't understand how that works.

view this post on Zulip Patrick Nicodemus (Sep 02 2023 at 21:58):

This paper definitely helped me to understand why the trace is important.

view this post on Zulip Jean-Baptiste Vienney (Sep 02 2023 at 21:59):

Thanks, I'll look at this!

view this post on Zulip Patrick Nicodemus (Sep 02 2023 at 22:15):

When $V$ is finite-dimensional, the bilinear map $V^\ast \otimes W \to \mathrm{Hom}(V,W)$ sending $v^\ast \otimes w$ to $\lambda x.v^\ast(x)w$ is an isomorphism. If you choose a basis $e_1,\dots,e_n$ for $V$ and use this to give a dual basis $e_1^\ast,\dots,e_n^\ast$ for $V^\ast$, then for every $T : V \to W$, we can look at where $T$ sends the basis elements, and for any vector $v$ in $V$, we can break $v$ into $e_1^\ast(v)e_1 + \dots + e_n^\ast(v)e_n$, apply $T$ to the basis elements and recombine them on the other side.

Every vector space has a map $V \otimes V^\ast \to \mathbb{R}$ given by the evaluation map, but only the finite-dimensional vector spaces have a natural map $\mathbb{R} \to V \otimes V^{\ast}$, which is given by using the above isomorphism $\mathrm{Hom}(V,V) \cong V^\ast \otimes V$ and sending $1 \in \mathbb{R}$ to the identity map $1_V$. Formally you can express this by saying that $V^\ast$ is right adjoint to $V$ in the one-object bicategory arising from the monoidal structure on $Vec$. The evaluation map $V^\ast \otimes V \to \mathbb{R}$ is the counit of the adjunction and the map I just described arising from the identity $\mathbb{R} \to V^\ast \otimes V$ is the unit.
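A coordinate sketch of this unit/counit structure in numpy (dimension and data arbitrary): the unit is the identity matrix viewed as $\sum_k e_k^\ast \otimes e_k$, the counit is evaluation, the snake composite is the identity on $V$, and the same data recovers the trace:

```python
# Added sketch of the duality unit/counit in coordinates.
import numpy as np

n = 4
rng = np.random.default_rng(5)
T = rng.standard_normal((n, n))
coev = np.eye(n)                          # unit R -> V* (x) V: sum_k e_k* (x) e_k
# counit ev: V (x) V* -> R is plain evaluation, ev(v, phi) = phi(v).

v = rng.standard_normal(n)
snake = np.einsum('a,ab->b', v, coev)     # (ev (x) id)(id (x) coev) applied to v
assert np.allclose(snake, v)              # triangle identity: composite = id_V

trace = np.einsum('ab,ab->', T, coev)     # ev o (T (x) id) o coev = tr(T)
assert np.isclose(trace, np.trace(T))
print("snake identity and trace-from-duality both check out")
```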

view this post on Zulip Jean-Baptiste Vienney (Sep 03 2023 at 10:35):

I think this recent paper gives one of the best proofs of the Cayley-Hamilton theorem: Hasse–Schmidt Derivations and Cayley–Hamilton Theorem for Exterior Algebras. Given that I've been working on categorifying the notion of Hasse-Schmidt derivation (through something that I call higher-order differential categories), it may be possible that someday I will be able to categorify this paper too, which is about a graded-commutative version of Hasse-Schmidt derivations, and thus maybe find a coordinate-free proof of the Cayley-Hamilton theorem. But I haven't even written up my work on categorifying Hasse-Schmidt derivations, so please don't ask me to talk about this. I just say that maybe in one year or two, I could provide a coordinate-free proof of the Cayley-Hamilton theorem.

view this post on Zulip John Baez (Sep 03 2023 at 20:12):

I'll check out that paper!

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 22:00):

Jean-Baptiste Vienney said:

$\Lambda^{n}(\lambda.1_E-u)(e_1 \wedge \ldots \wedge e_n) = \sum_{\sigma \in \mathfrak{S}_n} c_{\sigma(1),\ldots,\sigma(n)}.e_{\sigma(1)}\wedge \ldots \wedge e_{\sigma(n)}$

I think that you are right in looking at this with Grassmann's eyes... Then you should get that the coefficients of the determinant, gleaned from the $n$-fold exterior product, should be the intrinsic "symmetric" functions of an endomorphism (or matrix if you prefer), to wit, determinant, trace and the interpolations between them, essentially the trace of the $k$-th exterior power, is that right?
The baby challenge would be to give a direct intrinsic proof of the fact that
$(*)\quad A^2 - \mathrm{Tr}(A)A + \det(A)\mathrm{Id} = 0$,
where we got $\mathrm{Tr}(A)$ as (the scale factor of the homothety) $A \wedge \mathrm{Id} + \mathrm{Id} \wedge A$ and $\det(A)$ as the scale factor of the homothety $A \wedge A$. Of course, you can prove $(*)$ by brute force...

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 22:14):

Todd Trimble said:

Jean-Baptiste Vienney said:

I'd prefer it if there's no use of something like Cramer's rule and if it follows from a direct computation by applying the definitions (that's probably a matter of taste). But I can't guarantee that it is not going to take me the whole day -- if it works.

With regard to Cramer's rule: it's an easy consequence of the fact that the determinant is an alternating multilinear map $V^n = V \times \ldots \times V \to k$. In other words, a direct computation by applying the definition of determinant.

But the only reason for mentioning it is that it leads directly to the adjugate matrix of a matrix $A$, viz. a matrix $\widetilde{A}$ such that $A \widetilde{A} = \widetilde{A} A = \det(A) \cdot I$. The presence of such a matrix $\widetilde{A}$ is the key thing used in the nLab to prove Cayley-Hamilton, which has a nice short conceptual proof (essentially the same proof as in Lang's Algebra, if I remember correctly).

I agree that the adjugate matrix (or endomorphism) of $A$ is very helpful. Indeed, it is essentially the $(n-1)$-th exterior power of $A$... On the other hand, Cramer's rule is obvious if you look at your system of equations with Grassmann's eyes, taking advantage of the exterior product, if I remember correctly.

view this post on Zulip Todd Trimble (Sep 04 2023 at 02:42):

Jorge Soto-Andrade said:

Todd Trimble said:

Jean-Baptiste Vienney said:

I'd prefer it if there's no use of something like Cramer's rule and if it follows from a direct computation by applying the definitions (that's probably a matter of taste). But I can't guarantee that it is not going to take me the whole day -- if it works.

With regard to Cramer's rule: it's an easy consequence of the fact that the determinant is an alternating multilinear map $V^n = V \times \ldots \times V \to k$. In other words, a direct computation by applying the definition of determinant.

But the only reason for mentioning it is that it leads directly to the adjugate matrix of a matrix $A$, viz. a matrix $\widetilde{A}$ such that $A \widetilde{A} = \widetilde{A} A = \det(A) \cdot I$. The presence of such a matrix $\widetilde{A}$ is the key thing used in the nLab to prove Cayley-Hamilton, which has a nice short conceptual proof (essentially the same proof as in Lang's Algebra, if I remember correctly).

I agree that the adjugate matrix (or endomorphism) of $A$ is very helpful. Indeed, it is essentially the $(n-1)$-th exterior power of $A$... On the other hand, Cramer's rule is obvious if you look at your system of equations with Grassmann's eyes, taking advantage of the exterior product, if I remember correctly.

This is an insightful comment, and fits together well with some other insightful comments (by Patrick Nicodemus, and by Aaron Fairbanks who produced those very helpful string diagrams), and here I am now catching up with all this.

Translating what Jorge is saying, the adjugate of an endomorphism A: V \to V, for n-dimensional V, is in fact the map \tilde{A}: V \to V defined as the composite

V \cong (\Lambda^{n-1} V)^\ast \overset{(\Lambda^{n-1} A)^\ast}{\to} (\Lambda^{n-1} V)^\ast \cong V

where the isomorphisms come from the perfect pairing \wedge: V \otimes \Lambda^{n-1} V \to \Lambda^n V \cong k. This definition of the adjugate was embodied in the string diagrams. By that definition, the desired equation \tilde{A} A = \det(A) \cdot I_V is equivalent to commutativity of a pentagon diagram, one of whose legs is

V \overset{A}{\to} V \cong (\Lambda^{n-1} V)^\ast \overset{(\Lambda^{n-1} A)^\ast}{\to} (\Lambda^{n-1} V)^\ast

and the other of which is

V \overset{\det A \cdot I}{\to} V \cong (\Lambda^{n-1} V)^\ast.

Commutativity of this pentagon unwinds, string-diagrammatically, to a naturality square for \wedge: V \otimes \Lambda^{n-1} V \to \Lambda^n V \cong k, which says that the composite

V \otimes \Lambda^{n-1} V \overset{\wedge}{\to} \Lambda^n V \overset{\Lambda^n A}{\to} \Lambda^n V

equals the composite

V \otimes \Lambda^{n-1} V \overset{A \otimes \Lambda^{n-1} A}{\to} V \otimes \Lambda^{n-1} V \overset{\wedge}{\to} \Lambda^n V,

since \Lambda^n A = \det A \cdot I by the conceptual definition of determinant.
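To see this pairing definition in coordinates, here is a sympy sketch (my own bookkeeping; the only delicate point is the sign (-1)^i in the isomorphism V \cong (\Lambda^{n-1} V)^\ast) that builds \Lambda^{n-1} A from (n-1) \times (n-1) minors and recovers the adjugate:

```python
import sympy as sp
from itertools import combinations

n = 3
A = sp.Matrix(n, n, lambda i, j: sp.symbols(f'a{i}{j}'))

# Basis of Lambda^{n-1} V: the (n-1)-element subsets of {0,...,n-1}, in lex order.
subsets = list(combinations(range(n), n - 1))

# Matrix of Lambda^{n-1} A in that basis: entries are (n-1)x(n-1) minors of A.
Lam = sp.Matrix(len(subsets), len(subsets),
                lambda p, q: A.extract(list(subsets[p]), list(subsets[q])).det())

# The pairing e_i ^ e_{complement(i)} = (+/-)(e_0 ^ ... ^ e_{n-1}) identifies
# V with (Lambda^{n-1} V)^*; P records the sign and the index matching.
P = sp.zeros(len(subsets), n)
for i in range(n):
    comp = tuple(j for j in range(n) if j != i)
    P[subsets.index(comp), i] = (-1) ** i  # e_i moves past i smaller basis vectors

# Adjugate = pairing^{-1} o (Lambda^{n-1} A)^* o pairing, as in the composite above.
Atilde = P.inv() * Lam.T * P

assert sp.simplify(Atilde - A.adjugate()) == sp.zeros(n, n)
assert sp.simplify(Atilde * A - A.det() * sp.eye(n)) == sp.zeros(n, n)
```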

view this post on Zulip James Deikun (Sep 04 2023 at 18:24):

Some interesting, and occasionally annoying, facts about commutative rings and their modules that I found out in the course of investigating all this:

view this post on Zulip James Deikun (Sep 04 2023 at 18:26):

I'd really like to find more connections between, and justifications for, these facts. I've seen proofs for them in at least some special cases, but they seem more like "coincidences of calculation" than proper justifications.

view this post on Zulip James Deikun (Sep 04 2023 at 20:35):

In particular, it's pretty surprising that the ideal is seemingly "almost always" generated by a single polynomial even when the ring is not a field, since a polynomial ring in one indeterminate is a principal ideal domain precisely when its coefficient ring is a field (e.g. the ideal (2, x) in \mathbb{Z}[x] is not principal).

view this post on Zulip Aaron David Fairbanks (Sep 07 2023 at 21:02):

Here is a more basis-free version of the nLab proof of Cayley-Hamilton. (Were people able to follow that proof? I seemed to need to use the transpose of A instead of A.)

Let R be a commutative ring, V a finitely generated free R-module, and f an endomorphism of V.

Let V_{x=f} denote the R[x]-module structure on V induced by the unique R-algebra map R[x] \to \text{Hom}(V, V) sending x to f. Let F ⊣ U be the free-forgetful adjunction between R\text{-Mod} and R[x]\text{-Mod}. Let q denote the canonical map F(V) \to V_{x=f}, which is the component of the adjunction counit at V_{x=f}.

By the definition of V_{x=f}, we have
U(x \cdot \text{id}) = f.

Naturality of the counit then yields
x \cdot q = F(f);q,
i.e.,
(x \cdot \text{id} - F(f));q = 0.

Hence,
\det(x \cdot \text{id} - F(f)) \cdot q = \text{adj}(x \cdot \text{id} - F(f));(x \cdot \text{id} - F(f));q = 0.

Since q is surjective (it sends p(x) \otimes v to p(f)(v), so in particular 1 \otimes v \mapsto v), the element \det(x \cdot \text{id} - F(f)) of R[x], i.e. the characteristic polynomial of f, acts as zero on V_{x=f}. And x acts on V_{x=f} as f, so this says exactly that the characteristic polynomial vanishes at f.
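As a low-tech complement to the abstract argument, here is a throwaway sympy check of the conclusion, \chi_f(f) = 0, for a generic 3 \times 3 matrix with symbolic entries (names chosen just for the sketch):

```python
import sympy as sp

x = sp.symbols('x')
A = sp.Matrix(3, 3, lambda i, j: sp.symbols(f'a{i}{j}'))

# Characteristic polynomial chi(x) = det(x*I - A).
chi = A.charpoly(x)
coeffs = chi.all_coeffs()  # highest-degree coefficient first

# Evaluate chi at the matrix A itself: chi(A) = sum_k c_k A^k.
deg = len(coeffs) - 1
chiA = sp.zeros(3, 3)
for k, c in enumerate(coeffs):
    chiA += c * A**(deg - k)

# Cayley-Hamilton: chi(A) = 0.
assert sp.simplify(chiA) == sp.zeros(3, 3)
```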

view this post on Zulip John Baez (Sep 07 2023 at 21:50):

Wow, this looks elegant! I wish I understood it! I first run into trouble trying to understand what's U(x \cdot \text{id}). Maybe it should be obvious, but what is this \text{id} the identity morphism of?

view this post on Zulip John Baez (Sep 07 2023 at 21:51):

I guess V_{x=f}?

view this post on Zulip Aaron David Fairbanks (Sep 07 2023 at 21:52):

Yes, that's right. Sorry, I should have somehow annotated that.

view this post on Zulip John Baez (Sep 07 2023 at 21:55):

I think I get the proof now but it goes by so quickly it feels like I'm being fooled!

view this post on Zulip John Baez (Sep 07 2023 at 21:56):

Probably part of the problem is that it's late and I'm tired. But if this proof is valid, it could really be the proof from The Book for the Cayley-Hamilton theorem!

view this post on Zulip John Baez (Sep 07 2023 at 21:56):

(I never made it through the nLab proof.)

view this post on Zulip Aaron David Fairbanks (Sep 07 2023 at 21:57):

Wow, thanks! I appreciate the compliment. Let's hope we're not both being fooled, then.

view this post on Zulip John Baez (Sep 07 2023 at 22:07):

Other folks should check out this proof! Like @Todd Trimble.

view this post on Zulip Todd Trimble (Sep 08 2023 at 00:06):

John Baez said:

(I never made it through the nLab proof.)

Oh for heaven's sake.

view this post on Zulip James Deikun (Sep 08 2023 at 00:30):

If only I hadn't encountered the idea of determinants and characteristic polynomials for non-matrix algebras, I would probably consider this proof satisfying enough to foreclose my entire interest in the subject. But now I'm stuck pondering the True Meaning of the Determinant in general, and in particular the True Meaning of the Norm of the Sedenions. If it doesn't annihilate the zero divisors, what good is it? Well, it's good for producing the characteristic polynomial! But why?

view this post on Zulip John Baez (Sep 08 2023 at 09:30):

Todd Trimble said:

John Baez said:

(I never made it through the nLab proof.)

Oh for heaven's sake.

I'm doing lots of things. If I just glance at something out of random interest and it doesn't seem instantly delicious I'll switch to something else. Then there are things I'm actually working on, where I'm more persistent.

But sorry, I'd forgotten you were probably the one who wrote this proof. @Aaron David Fairbanks's proof might be a refashioning of the same proof - I don't know.

view this post on Zulip John Baez (Sep 08 2023 at 09:35):

I haven't found any interesting applications of sedenions, or any deep connections between them and other topics in math or physics. I suspect they might be good for something, but I haven't found it.

This is quite different than the octonions, which have rich connections to exceptional Lie groups, superstring theory, etc.

The Wikipedia article says

Sedenion neural networks provide[ further explanation needed] a means of efficient and compact expression in machine learning applications and have been used in solving multiple time-series and traffic forecasting problems.

but I suspect this is basically bullshit, despite the 2 references.

view this post on Zulip John Baez (Sep 08 2023 at 09:37):

So, right now I can't find "sedenion determinants" interesting.

view this post on Zulip James Deikun (Sep 08 2023 at 11:40):

I'm mostly interested in the sedenions here because they are an algebra for which we have a known expression for the "determinant" (the norm) that works for deriving the characteristic polynomial, yet the norm is not a multiplicative homomorphism, making them a useful test case for figuring out what the true defining property of an algebra's determinant actually is.

view this post on Zulip John Baez (Sep 08 2023 at 11:49):

Okay, this is a decent reason for being interested in them. Having spent years studying the octonions I keep hoping the sedenions will actually be connected to some deep mathematics, but so far I've seen no evidence of that. David Corfield calls

reals, complexes, quaternions, octonions, sedenions, ...

a 'sunken island chain', where the first islands poke above the water but the later ones don't.

view this post on Zulip Todd Trimble (Sep 08 2023 at 16:16):

John Baez said:

Todd Trimble said:

John Baez said:

(I never made it through the nLab proof.)

Oh for heaven's sake.

I'm doing lots of things. If I just glance at something out of random interest and it doesn't seem instantly delicious I'll switch to something else. Then there are things I'm actually working on, where I'm more persistent.

But sorry, I'd forgotten you were probably the one who wrote this proof. Aaron David Fairbanks's proof might be a refashioning of the same proof - I don't know.

I like Aaron's fashioning quite a bit -- it's nice that it's basis-free, and it is Book-like, and I'm considering editing the nLab proof as a result. But I don't think the proofs are all that different. For reference, I'm talking about the proof of Cayley-Hamilton, Theorem 1.2, which is only a few lines long. The equation on display in that proof is very close to the last line Aaron gave. In both cases, the proofs assume the stuff about the adjugate matrix as established beforehand; if the nLab stuff didn't have an inviting aroma, it may be because of the time spent establishing this first (Lemma 1.1).

view this post on Zulip Jorge Soto-Andrade (Sep 12 2023 at 03:26):

Hi! Sorry for jumping in so late into this interesting discussion, but I just wanted to share a sort of divergent approach to Cayley-Hamilton I realized last week. First, I would rather begin from scratch, with a linear endomorphism A of an n-dimensional vector space V and the "natural" question: can you find a "universal" polynomial equation, of degree not greater than n, satisfied by A? It seems to me that you can get this polynomial "bare-handed", just by some elementary combinatorial yoga, if you look at the whole situation with "Grassmann eyes".
Indeed, notice that for the baby case n = 2 (with I = \mathrm{Id}_V), we have
A^2 \wedge I = (A \circ A) \wedge (I \circ I) = (A \wedge I) \circ (A \wedge I) = (A \wedge I + I \wedge A) \circ (A \wedge I) - (I \wedge A) \circ (A \wedge I) = (A \wedge I + I \wedge A) \circ (A \wedge I) - (A \wedge A) \circ (I \wedge I)
so that
A^2 = (A \wedge I + I \wedge A) \circ A - (A \wedge A) \circ I = S^1(A) A - S^2(A) I
in terms of the Grassmann-symmetric functions S^k(A) of A.
Extension to general n is rather straightforward. E.g. for n = 3, you begin with A^3 \wedge I \wedge I and end up analogously with A^3 = S^1(A) A^2 - S^2(A) A + S^3(A) I.
Notice that here by A \wedge I and so on, we mean the "pseudo-endomorphism" (= span) of \bigwedge^2 V given by the morphisms u \otimes v \mapsto u \wedge v and u \otimes v \mapsto Au \wedge v.
So, we play around with pseudo-endomorphisms but we end up with a polynomial involving bona fide endomorphisms of V.
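Jorge's identities are easy to test in coordinates. In the following sympy sketch (mine), S^k(A) is taken to be the trace of \Lambda^k A, i.e. the sum of the principal k \times k minors, which is what the Grassmann-symmetric functions amount to in a basis:

```python
import sympy as sp
from itertools import combinations

A = sp.Matrix(3, 3, lambda i, j: sp.symbols(f'a{i}{j}'))

def S(A, k):
    """Grassmann-symmetric function S^k(A): trace of the k-th exterior power
    of A, computed as the sum of the principal k x k minors."""
    n = A.shape[0]
    return sum(A.extract(list(s), list(s)).det()
               for s in combinations(range(n), k))

# The n = 3 identity: A^3 = S^1(A) A^2 - S^2(A) A + S^3(A) I.
lhs = A**3
rhs = S(A, 1) * A**2 - S(A, 2) * A + S(A, 3) * sp.eye(3)
assert sp.simplify(lhs - rhs) == sp.zeros(3, 3)
```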


view this post on Zulip John Baez (Sep 12 2023 at 08:57):

This looks really neat, but I guess I don't understand the general theory of these "pseudo-endomorphisms" from your one example. What sort of maps A, B am I allowed to use to form a pseudo-endomorphism A \wedge B, and how is it defined?

view this post on Zulip Jorge Soto-Andrade (Sep 13 2023 at 00:41):

John Baez said:

This looks really neat, but I guess I don't understand the general theory of these "pseudo-endomorphisms" from your one example. What sort of maps A, B am I allowed to use to form a pseudo-endomorphism A \wedge B, and how is it defined?

Sorry, a bit more generally: for linear endomorphisms A and B of V, the idea is to define A \wedge B as the span (pseudo-morphism) from V \wedge V to itself consisting of the two linear maps from V \otimes V to V \wedge V given by u \otimes v \mapsto u \wedge v and u \otimes v \mapsto Au \wedge Bv respectively. In the approach I suggest, you only need the case A = \mathrm{Id}_V.
You could define this more generally for any pair of linear maps A, B, and also for n-tuples A_1, \ldots, A_n.
Of course, the naive definition (A \wedge B)(u \wedge v) = Au \wedge Bv does not work; to fix that, some folks define
(A \wedge B)(u \wedge v) = Au \wedge Bv + Bu \wedge Av \; (= Au \wedge Bv - Av \wedge Bu).
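To see concretely why the naive formula is ill-defined: u \wedge v = -v \wedge u, but Au \wedge Bv \neq -Av \wedge Bu in general. A short sympy check, with a rank-one A chosen just as the counterexample:

```python
import sympy as sp

def wedge2(u, v):
    # Coordinate of u ^ v in the basis e1 ^ e2 of Lambda^2(k^2).
    return u[0] * v[1] - u[1] * v[0]

A, B = sp.Matrix([[1, 0], [0, 0]]), sp.eye(2)
e1, e2 = sp.Matrix([1, 0]), sp.Matrix([0, 1])

# e1 ^ e2 and -(e2 ^ e1) are the same element of Lambda^2 ...
assert wedge2(e1, e2) == -wedge2(e2, e1)
# ... but the naive recipe (A ^ B)(u ^ v) = Au ^ Bv disagrees on them: 1 vs 0.
assert wedge2(A * e1, B * e2) != -wedge2(A * e2, B * e1)
```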

view this post on Zulip James Deikun (Sep 13 2023 at 00:49):

You seem to compose pseudo endomorphisms too, how does that work? The obvious thing would be the usual composition by pullback, but how does it look in Vect or wherever you're working?

view this post on Zulip Jorge Soto-Andrade (Sep 13 2023 at 01:00):

James Deikun said:

You seem to compose pseudo endomorphisms too, how does that work? The obvious thing would be the usual composition by pullback, but how does it look in Vect or wherever you're working?

Yes, it looks like that, but more precisely, with the above notations, (A' \wedge B') \circ (A \wedge B) = (A' \circ A) \wedge (B' \circ B), which is the pseudo-endomorphism (span) defined by the maps u \otimes v \mapsto u \wedge v and u \otimes v \mapsto (A' \circ A) u \wedge (B' \circ B) v. A more general categorical setting is for sure possible ...
``` \wedge should be functorial on pairs of morphisms (A, B)
I would guess that for this to work you just need a \otimes in your category which projects nicely onto the antisymmetric \wedge product. A key point though is that in Vect, for an n-dimensional vector space V, the n-fold exterior product \bigwedge^n V is isomorphic to the base field. A sort of "Grassmann-like" symmetric monoidal category...

view this post on Zulip John Baez (Sep 13 2023 at 08:37):

Neat! By the way, the quote in your last comment has an error, so your reply to James is part of your quote of him. Your end-quote symbol ``` needs a blank line after it.

view this post on Zulip ʇɐ (Sep 16 2023 at 14:17):

Patrick Nicodemus said:

When V is finite-dimensional, the bilinear map V^\ast \otimes W \to \mathrm{Hom}(V, W) sending v^\ast \otimes w to \lambda x.\, v^\ast(x) w is an isomorphism. If you choose a basis e_1, \dots, e_n for V and use this to give a dual basis e_1^\ast, \dots, e_n^\ast for V^\ast, then for every T: V \to W, we can look at where T sends the basis elements, and for any vector v in V, we can break v into e_1^\ast(v) e_1 + \dots + e_n^\ast(v) e_n, apply T to the basis elements, and recombine them on the other side.

Every vector space has a map V^\ast \otimes V \to \mathbb{R} given by the evaluation map, but only the finite-dimensional vector spaces have a natural map \mathbb{R} \to V^\ast \otimes V, which is given by using the above isomorphism \mathrm{Hom}(V, V) \cong V^\ast \otimes V and sending 1 \in \mathbb{R} to the identity map 1_V. Formally you can express this by saying that V^\ast is right adjoint to V in the one-object bicategory arising from the monoidal structure on Vec. The evaluation map V^\ast \otimes V \to \mathbb{R} is the counit of the adjunction, and the map \mathbb{R} \to V^\ast \otimes V I just described, arising from the identity, is the unit.

Incidentally, this shows that Garibaldi's generic element is uniquely determined: given an \epsilon that occurs as the counit for an adjunction, the corresponding unit \eta is unique, so if \epsilon \colon V^\ast \otimes V \to k is the usual evaluation then Garibaldi's \gamma = \eta(1) \in V \otimes k[V^\ast] is independent of the chosen basis.
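The basis-independence here can be checked in coordinates: under V^\ast \otimes V \cong \mathrm{Hom}(V, V), the unit \eta(1) = \sum_i e_i^\ast \otimes e_i lands on the identity endomorphism whatever basis one uses. A small sympy sketch (the two bases are arbitrary choices for the example):

```python
import sympy as sp

def coevaluation(P):
    """eta(1) = sum_i b_i^* (x) b_i for the basis given by the columns of P,
    viewed in Hom(V, V) via phi (x) w |-> (x |-> phi(x) w).
    In matrices: sum_i (column b_i)(row b_i^*), dual basis = rows of P^{-1}."""
    Pinv = P.inv()
    return sum((P[:, i] * Pinv[i, :] for i in range(P.shape[1])),
               sp.zeros(*P.shape))

# Two different bases of Q^2, as columns of invertible matrices.
P = sp.Matrix([[1, 1], [0, 1]])
Q = sp.Matrix([[2, 1], [1, 1]])

# Basis-independence: both produce the identity endomorphism.
assert coevaluation(P) == sp.eye(2) == coevaluation(Q)
```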

view this post on Zulip Jorge Soto-Andrade (Sep 16 2023 at 17:18):

John Baez said:

Neat! By the way, the quote in your last comment has an error, so your reply to James is part of your quote of him. Your end-quote symbol ``` needs a blank line after it.

Thanks for pointing this out.
By the way, I checked that the "natural" composition of my pseudo-endomorphisms coincides with the general composition of spans via pullback (the pullback boiling down naturally to V \otimes V endowed with the "obvious" linear maps to V \wedge V). Notice however that we want not just any span from V \wedge V to itself as a result, but one which is an "avatar" of V \otimes V, with left arrow being the "canonical" u \otimes v \mapsto u \wedge v map (recall that "avatar" means "descent" etymologically :blush:).