Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: deprecated: mathematics

Topic: characters of finite group representations


view this post on Zulip John Baez (Sep 03 2023 at 09:45):

@Todd Trimble recently told me a slick proof of a cool fact he learned from Charles Rezk on MathOverflow.

First a warmup fact. Suppose ρ is a representation of a group G on a finite-dimensional complex vector space. If g ∈ G then tr(ρ(g)) is a finite sum of nth roots of unity, where n is the order of g: that is, the smallest integer n > 0 with 𝑔ⁿ = 1.

Why? Well, we have ρ(g)ⁿ = 1. You can always diagonalize a complex matrix whose nth power is 1. (See why?) So do that. Then its diagonal entries have to be nth roots of unity. So tr(ρ(g)) is a sum of nth roots of unity.

Now for the cool part. If G is a symmetric group, then tr(ρ(g)) is an integer!

This is harder, but still quick. Let F be the field consisting of all rational linear combinations of nth roots of unity. If k is relatively prime to n there's an automorphism of F sending each nth root of unity to its kth power. Applying this automorphism to the entries of our diagonalized matrix ρ(g) we get the matrix ρ(g)ᵏ.

But for a permutation with gⁿ = 1, gᵏ has the same cycle structure as g whenever k is relatively prime to n. (Draw some examples). So gᵏ is conjugate to g. So ρ(g)ᵏ is conjugate to ρ(g). But the trace of a matrix doesn't change when you conjugate that matrix by something, so tr(ρ(g)ᵏ) = tr(ρ(g)).

We've learned that tr(ρ(g)) is a sum of nth roots of unity, but it also doesn't change when we apply the automorphism of F sending any nth root of unity to its kth power, whenever k is relatively prime to n. By some Galois theory, this forces tr(ρ(g)) to be an integer!

(Charles Rezk said this much faster since he assumed you know all the background. I've tried to minimize the amount of jargon. But we still need some Galois theory at the end - basically, the only algebraic integers in F invariant under the Galois group of F over ℚ are ordinary integers. Maybe there's some easy way to see this fact in the special case at hand.)
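To make both facts concrete, here is a minimal numerical sketch (the 2-dimensional representation of S₃ as symmetries of an equilateral triangle is a standard construction assumed here, not spelled out in the post above): the eigenvalues of each ρ(g) are roots of unity, and every trace comes out an integer.

```python
import numpy as np

# The 2-dimensional irreducible representation of S_3, realized as the symmetry
# group of an equilateral triangle: r is rotation by 120 degrees (a 3-cycle),
# f is a reflection (a transposition).
c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])
f = np.array([[1.0, 0.0], [0.0, -1.0]])

group = {
    "e": np.eye(2), "r": r, "rr": r @ r,        # identity and the two 3-cycles
    "f": f, "rf": r @ f, "rrf": r @ r @ f,      # the three transpositions
}

for name, g in group.items():
    eigvals = np.linalg.eigvals(g)              # all on the unit circle: roots of unity
    print(name, np.round(eigvals, 3), round(float(np.trace(g)), 6))
```

The traces come out as 2 for the identity, −1 for the two 3-cycles, and 0 for the three transpositions: integers, as claimed.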

view this post on Zulip David Egolf (Sep 03 2023 at 15:58):

John Baez said:

You can always diagonalize a complex matrix whose nth power is 1. (See why?)

I don't see why, but I would like to! I tried to figure this out below, but I got stuck.

The minimal polynomial of a matrix $A$ is the polynomial $m$ of least degree so that $m(A)=0$. I guess that if $n$ is the smallest power such that $A^n=I$, then the minimal polynomial of $A$ is $m=x^n-1$, so that $m(A) = A^n-I = 0$. I'm not sure how I could prove this, though.

To show that $A$ is diagonalizable, I will try to show that $A$ has no repeated eigenvalues. I read on Wikipedia that the roots of the minimal polynomial $m$ are exactly the eigenvalues of $A$. In our case the minimal polynomial $m$ has $n$ distinct roots, namely the $n$th roots of 1 in $\mathbb{C}$. So, $A$ has at least $n$ distinct eigenvalues. If we knew that $A$ was an $n \times n$ matrix, then it could have at most $n$ distinct eigenvalues, and so we could conclude that all of its eigenvalues must be distinct, and so it must be diagonalizable.

Can we show that a square matrix $A$ must be an $n \times n$ matrix if $n$ is the smallest positive integer so that $A^n=I$? No - this is false. The $1 \times 1$ matrix $-1$ satisfies $(-1)^2=1$, but $(-1)^1 \neq 1$. So, it's back to the drawing board!

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 16:42):

David Egolf said:

John Baez said:

You can always diagonalize a complex matrix whose nth power is 1. (See why?)

I don't see why, but I would like to! I tried to figure this out below, but I got stuck. [...]

Simpler, I think, is to notice that if $A^n=I$ then you can explicitly describe the (full family of) "eigenprojectors" $P_\omega$, where $\omega$ is an $n$th root of unity, as

$$P_\omega = \frac{1}{n}\sum_{0\leq k \leq n-1} \bar{\omega}^k A^k.$$

To do spectral theory it is better to work in the "operator algebra". To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$). If you have a minimal polynomial equation satisfied by $A$ you can explicitly describe those projectors. Think of the baby examples $A^2 = I$, and then $A^3 = I$.
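As a quick numerical sanity check of this formula (with the $1/n$ normalization), here is a minimal sketch using a rotation by 120 degrees, which satisfies $A^3 = I$; the projectors it produces are idempotent, sum to the identity, and satisfy $A P_\omega = \omega P_\omega$:

```python
import numpy as np

n = 3
c, s = np.cos(2 * np.pi / n), np.sin(2 * np.pi / n)
A = np.array([[c, -s], [s, c]])          # rotation by 120 degrees, so A^3 = I

roots = [np.exp(2j * np.pi * m / n) for m in range(n)]   # the cube roots of unity

# P_omega = (1/n) * sum_k conj(omega)^k A^k
P = {w: sum(np.conj(w) ** k * np.linalg.matrix_power(A, k) for k in range(n)) / n
     for w in roots}

assert np.allclose(sum(P.values()), np.eye(2))   # the projectors decompose the identity
for w, Pw in P.items():
    assert np.allclose(Pw @ Pw, Pw)              # each one is idempotent
    assert np.allclose(A @ Pw, w * Pw)           # and projects onto the omega-eigenspace
print("eigenprojector identities verified")
```

(Here $P_1$ comes out as the zero matrix, since 1 is not an eigenvalue of a rotation by 120 degrees.)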

view this post on Zulip David Egolf (Sep 03 2023 at 17:46):

Thanks @Jorge Soto-Andrade for your explanation! Unfortunately, I am not following very well.

I'd especially appreciate any elaboration/explanation anyone could offer on the quote below:

Jorge Soto-Andrade said:

To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$).

For example, if we have managed to diagonalize $A$ as $A=PDP^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?

(References to a textbook that talks about this would also be welcome!)

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 18:51):

David Egolf said:

For example, if we have managed to diagonalize $A$ as $A=PDP^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?

(References to a textbook that talks about this would also be welcome!)
I am afraid that this way of fathoming diagonalisation is not the best one for "everyday life". What is really useful is to be able to find an eigenbasis for $A$ (an endomorphism of $V$), i.e. a basis of our vector space $V$ consisting of eigenvectors of $A$. Then you can easily figure out what happens with the powers of $A$, for example (a system-theoretical motivation for linear algebra). Then you have the eigenspaces $V_\lambda$ consisting of all eigenvectors associated to the same eigenvalue $\lambda$. They provide a direct sum decomposition of $V$. This is a "better" definition of diagonalisation, I claim.

Now, you can climb to the next floor and look at the operators (endomorphisms) of $V$ (the Russians do this very well in their textbooks, better than American textbooks I am afraid - I am looking for some nice reference in English though). So, you look at projectors $P_\lambda$ of $V$ onto $V_\lambda$ ALONG the other eigenspaces $V_\mu$. This means that $P_\lambda$ is the identity on $V_\lambda$ and equal to $0$ on $V_\mu$ for $\mu \neq \lambda$. Then you have that all the eigenprojectors add up to the identity of $V$, and so on. So, to diagonalise $A$ means to decompose the identity operator of $V$ in a way $A$ would love... Maybe you should work this out (in matrix form if you like) for the baby example $A^2 = I$, in dimension 2, say.
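For instance, working out that baby example with $A$ the swap matrix (a choice made here just for concreteness):

$$A=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad P_{+1}=\tfrac12(I+A)=\tfrac12\begin{pmatrix}1&1\\1&1\end{pmatrix},\qquad P_{-1}=\tfrac12(I-A)=\tfrac12\begin{pmatrix}1&-1\\-1&1\end{pmatrix},$$

and indeed $P_{+1}+P_{-1}=I$, $P_{\pm 1}^2=P_{\pm 1}$, $P_{+1}P_{-1}=0$, and $A=P_{+1}-P_{-1}$: the identity has been decomposed into the projectors onto the eigenlines spanned by $(1,1)$ and $(1,-1)$.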

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 19:18):

You may check "Linear Algebra Done Right" by Axler: linear.axler.net (also available at the Russian site gen.lib.rus.ec). His viewpoint is close to ours.

view this post on Zulip David Egolf (Sep 03 2023 at 19:26):

Awesome, that looks really helpful @Jorge Soto-Andrade ! It will take me a little time to work through this and hopefully understand it.

view this post on Zulip James Deikun (Sep 03 2023 at 19:49):

David Egolf said:

For example, if we have managed to diagonalize $A$ as $A=PDP^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?

An $n$ by $n$ diagonal matrix $D$ has $n$ eigenprojectors $p_k$, each of which has a 1 on the diagonal at position $(k,k)$ and the rest of the matrix 0. Then you take $P p_k P^{-1}$ as eigenprojectors for $A$.
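A minimal numerical sketch of this recipe (the particular matrix below is just an arbitrary diagonalizable example made up for illustration): build the $P p_k P^{-1}$ and check that they sum to the identity and reassemble $A$.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 3))               # a random change of basis
D = np.diag([2.0, -1.0, 5.0])
A = P @ D @ np.linalg.inv(P)                  # a diagonalizable matrix, A = P D P^-1

Pinv = np.linalg.inv(P)
projectors = []
for k in range(3):
    p_k = np.zeros((3, 3))
    p_k[k, k] = 1.0                           # 1 on the diagonal at (k, k), zero elsewhere
    projectors.append(P @ p_k @ Pinv)         # eigenprojector of A for eigenvalue D[k, k]

assert np.allclose(sum(projectors), np.eye(3))                          # decomposition of the identity
assert np.allclose(A, sum(D[k, k] * projectors[k] for k in range(3)))   # spectral decomposition of A
print("P p_k P^{-1} projectors check out")
```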

view this post on Zulip James Deikun (Sep 03 2023 at 19:54):

It doesn't work out this nicely for infinite matrices. The projectors arise from

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 20:31):

James Deikun said:

It doesn't work out this nicely for infinite matrices. The projectors arise from

Yes, but then you go over to the spectral measure and get an integral instead of the discrete direct sum decomposition of V into eigenspaces.

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 20:39):

John Baez said:

Todd Trimble recently told me a slick proof of a cool fact he learned from Charles Rezk on MathOverflow. [...] Now for the cool part. If G is a symmetric group, then tr(ρ(g)) is an integer!

Nice remark. The key point seems to be that if you Galois conjugate a linear representation of the symmetric group, you get an isomorphic representation. I would not resist the temptation to claim that a "deep" proof of this follows from the well-known fact that the symmetric group $S_n$ is $PGL(n, \mathbb F_1)$, the general linear group over "the field with one element" (this is the famous "boutade de Tits"). The irreducible representations of $PGL(n, \mathbb F_q)$ split into series parameterized by the multiplicative characters of the base finite field and its extensions. When you Galois conjugate, you change in general the parameter value. But for the field with one element all extensions collapse into the base field ("because" every power of 1 is 1) and you have no parameters.

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 21:37):

Jorge Soto-Andrade said:

The key point seems to be that if you Galois conjugate a linear representation of the symmetric group, you get an isomorphic representation, so that its character is a Galois-invariant algebraic integer, hence a rational integer. [...]

In a less metaphorical vein, the same Galois argument applies to show that the Gelfand character of any finite group $G$ is integer-valued (the Gelfand character is the sum of all irreducible characters of $G$, each one exactly once). Now, if you calculate the Gelfand character for the usual groups you always find non-negative integer values. I conjectured for a while that this was always true (the Gelfand character looks very much like the character of a permutation representation associated to a transitive $G$-space; for example its mean value is 1) until I stated this conjecture during a talk in Tokyo, and the day after Yokonuma came to see me to apologise (with a deep bow), saying that last night he had calculated the Gelfand character for the Mathieu group M11 and found negative values, so my conjecture was disproved... Gelfand himself told me later that he had made the same conjecture and asked his students to calculate the Gelfand character of every finite group in sight, but that they were too lazy to do that...

There are still several open questions related to Gelfand models. Gelfand's intuition was that they were representations with "very special properties". Notice that for $G = SO(3, \mathbb R)$ a Gelfand model is provided by the natural unitary representation of $G$ in the Hilbert space $L^2(S^2)$, whose decomposition into irreducibles involves the famous Legendre polynomials (as the kernels of the eigenprojectors regarded as integral operators, in the sense of functional analysis). Here by eigenprojectors I mean the projectors associated to the decomposition of $L^2(S^2)$ into a Hilbert direct sum of irreducibles (which are for sure eigenprojectors for some Laplacian-like intertwining operator).
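As a tiny worked example (using the standard character table of $S_3$, a well-known fact rather than anything from the message above): on the conjugacy classes $\{e\}$, transpositions, and 3-cycles, the irreducible characters of $S_3$ are $(1,1,1)$, $(1,-1,1)$ and $(2,0,-1)$, so the Gelfand character is

$$\chi_{\mathrm{Gelfand}} = (1,1,1)+(1,-1,1)+(2,0,-1) = (4,0,1),$$

integer-valued as the Galois argument predicts, and in this small case non-negative as well.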

view this post on Zulip John Baez (Sep 04 2023 at 08:20):

David Egolf said:

John Baez said:

You can always diagonalize a complex matrix whose nth power is 1. (See why?)

I don't see why, but I would like to! I tried to figure this out below, but I got stuck.

I see two main ways: the "hand-crafted" way, where you develop an argument specifically for this problem, or the "mechanical" way, where you use the technology developed to solve many similar problems.

view this post on Zulip John Baez (Sep 04 2023 at 08:22):

You're trying to diagonalize a complex matrix here. You can't diagonalize every complex matrix. How close can you come? This is what the Jordan normal form tells you!

view this post on Zulip John Baez (Sep 04 2023 at 08:24):

For any linear transformation of a finite-dimensional vector space, you can find a basis such that the transformation is described by a block diagonal matrix of this form:

view this post on Zulip John Baez (Sep 04 2023 at 08:26):

It has an arbitrary number of blocks of arbitrary size; each block has some fixed number $\lambda_i$ running down the diagonal and 1's directly above the diagonal. All the other entries are zero.

view this post on Zulip John Baez (Sep 04 2023 at 08:27):

The numbers $\lambda_i$ are the eigenvalues of your transformation, and there's one eigenvector (up to scalar multiples) for each block.

(If two or more blocks have equal $\lambda_i$'s you get more eigenvectors by taking linear combinations of the eigenvectors for these blocks.)

view this post on Zulip John Baez (Sep 04 2023 at 08:29):

This result is a power tool for studying linear transformations of a finite-dimensional complex vector space! So if someone tells you "I have a complex matrix whose nth power is 1", you can change basis to put it in Jordan normal form, and see what it must be like for its nth power to equal 1.
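Here is a small sketch of that "mechanical" route (assuming sympy is available, with matrices chosen just for illustration): a matrix satisfying $A^n = I$ gets a diagonal Jordan form, while a shear matrix does not.

```python
import sympy as sp

# A permutation matrix for the 3-cycle (1 2 3): it satisfies A**3 == eye(3).
A = sp.Matrix([[0, 0, 1],
               [1, 0, 0],
               [0, 1, 0]])
P, J = A.jordan_form()      # A = P * J * P**-1
print(J)                    # diagonal, with the three cube roots of unity on the diagonal

# By contrast, a shear is not diagonalizable: its Jordan form keeps a 1 above
# the diagonal, and indeed no power of it equals the identity.
B = sp.Matrix([[1, 1],
               [0, 1]])
_, K = B.jordan_form()
print(K)                    # Matrix([[1, 1], [0, 1]])
```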

view this post on Zulip John Baez (Sep 04 2023 at 08:37):

Being a lazy bum I reached for this "mechanical" method rather than trying to develop a hand-crafted method.

view this post on Zulip David Egolf (Sep 04 2023 at 16:04):

Thanks for explaining, @John Baez ! I've seen the Jordan normal form before, but have never been comfortable with it. So it's great to have an example where it's useful for something I'm trying to understand.

Let me see how this works. We have a linear map going from a complex finite-dimensional vector space $V$ to itself, $\rho(g): V \to V$, where $\rho(g)^n = 1_V$. There exists a basis for $V$ so that $\rho(g)$ can be described by a matrix $A$ in Jordan normal form. In that basis, for any integer $m>0$, the linear map $\rho(g)^m$ is described by the matrix $A^m$.

I am guessing that if $A$ has any off-diagonal 1's, then $A^m$ will have off-diagonal non-zero entries as well, for all $m > 0$. I started trying to prove this, but it was trickier than I expected! To prove this though, I suspect it is important that $\rho(g)$ must be invertible, so that $A$ has no zero entries on its diagonal. (And since $\rho(g)^n=1_V$, $\rho(g)$ must be invertible.)

If we assume this, however, then since $A^m=I$ has no off-diagonal non-zero entries, we conclude that the Jordan normal form $A$ for $\rho(g)$ must be diagonal. So, there exists a basis for $V$ in which the matrix describing $\rho(g)$ is diagonal, which is what we wanted to show.

EDIT:
I think I understand why if the Jordan normal form matrix $A$ has any off-diagonal 1's, then $A^m$ for $m>0$ will have off-diagonal non-zero entries too. If $A$ has an off-diagonal 1, then for some basis vectors $b_i$ and $b_{i-1}$ (in the basis that generates the Jordan normal form) we have $\rho(g)(b_i) = \lambda_i b_i + 1 b_{i-1}$. If we keep applying $\rho(g)$ to this result, we never get to map the $b_{i-1}$ to zero (because $\rho(g)$ is invertible) and we never get to map the $b_{i-1}$ back into some amount of $b_i$ (by the structure of the Jordan normal form matrix).

At least that's my intuition. Looking at this again, I think there is some more work to do to upgrade this to a proof.

view this post on Zulip John Baez (Sep 04 2023 at 19:42):

David Egolf said:

Thanks for explaining, John Baez ! I've seen the Jordan normal form before, but have never been comfortable with it. So it's great to have an example where it's useful for something I'm trying to understand.

Yes, I think you need to use it to appreciate it. Jordan normal form makes you the "king of linear transformations" - or at least king of any one linear transformation, since it lets you quickly settle many questions about a single linear transformation $T$ and polynomials $a_n T^n + \cdots + a_1 T + a_0$.

I am guessing that if $A$ has any off-diagonal 1's, then $A^m$ will have off-diagonal non-zero entries as well, for all $m > 0$.

That's true if $A$ also has enough nonzero on-diagonal entries. Otherwise we get counterexamples like

$$A = \left( \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right)$$

which has $A^2 = 0$.

But anyway, yes: this is the key to solving my puzzle, if you want to do it with Jordan normal forms.

I started trying to prove this, but it was trickier than I expected! To prove this though, I suspect it is important that $\rho(g)$ must be invertible, so that $A$ has no zero entries on its diagonal.

Good! But that's more than you need. It's easiest to say what's going on for a matrix $A$ in Jordan normal form that has just a single block. If its diagonal entries are nonzero and it has at least one nonzero off-diagonal entry, all its powers also have nonzero off-diagonal entries... so none of its powers can equal 1.

But to get $A^n = 1$, the diagonal entries of $A$ can't be zero.

So, to get $A^n = 1$, the off-diagonal entries of $A$ must be zero.

So, any matrix $A$ whose nth power is $1$ for some $n$ must be diagonalizable!

view this post on Zulip John Baez (Sep 04 2023 at 19:49):

David Egolf said:

At least that's my intuition. Looking at this again, I think there is some more work to do to upgrade this to a proof.

The most important thing is to get some intuition for how to use Jordan normal form to do stuff... you can always rigorize your argument later.

view this post on Zulip John Baez (Sep 04 2023 at 19:50):

But here are some extra tips! First, it's good to think about a single Jordan block since when you apply any polynomial to a matrix in Jordan normal form you just apply it to each block: the different blocks don't talk to each other.

view this post on Zulip John Baez (Sep 04 2023 at 19:51):

Second, we can always write a matrix that's a single Jordan block as

$$\lambda I + N$$

where $I$ is the identity matrix and $N$ is a matrix with 1's right above the diagonal.

view this post on Zulip John Baez (Sep 04 2023 at 19:55):

The matrix $N$ is what we call nilpotent: if we raise it to a high enough power we get zero!

It's good to take a matrix like this and square it, cube it, etc. to get an intuition for how this works, and when you get $N^n = 0$. This matrix is very important, since nilpotent transformations are important, and any nilpotent matrix has a Jordan form where all the Jordan blocks are of the form $N$. (Of course this matrix I'm calling $N$ comes in various sizes.)

view this post on Zulip John Baez (Sep 04 2023 at 19:57):

I guess I need to say: $N$ has all its nonzero entries one above the main diagonal, $N^2$ has all its nonzero entries two above the main diagonal, and so on.
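A minimal numpy sketch of that suggestion, in the 4×4 case (an arbitrary size chosen for illustration): each power pushes the 1's one step further above the diagonal, and $N^4 = 0$.

```python
import numpy as np

N = np.eye(4, k=1)          # 4x4 matrix with 1's directly above the main diagonal
for m in range(1, 5):
    Nm = np.linalg.matrix_power(N, m)
    # N^m has its 1's on the m-th superdiagonal; N^4 is the zero matrix
    print(f"N^{m} =\n{Nm.astype(int)}\n")
```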

view this post on Zulip John Baez (Sep 04 2023 at 19:59):

So, what happens if we take a power of

$$A = \lambda I + N?$$

We get things like

$$A^2 = \lambda^2 + 2 \lambda N + N^2$$

$$A^3 = \lambda^3 + 3 \lambda^2 N + 3 \lambda N + N^3$$

view this post on Zulip John Baez (Sep 04 2023 at 20:00):

The matrix $N^k$ has all its nonzero entries $k$ above the main diagonal, so the different terms in these sums can't cancel each other!

view this post on Zulip John Baez (Sep 04 2023 at 20:01):

So, if $\lambda \ne 0$, all the powers of $A$ will have nonzero entries off the main diagonal.
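In the smallest interesting case, a $2 \times 2$ Jordan block, this is completely explicit:

$$A=\begin{pmatrix}\lambda&1\\0&\lambda\end{pmatrix},\qquad A^m=\begin{pmatrix}\lambda^m&m\lambda^{m-1}\\0&\lambda^m\end{pmatrix},$$

so for $\lambda\neq 0$ the off-diagonal entry $m\lambda^{m-1}$ is nonzero for every $m\geq 1$, and no power of $A$ can equal $1$.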

view this post on Zulip John Baez (Sep 04 2023 at 20:03):

This is a way of proving this claim:

It's easiest to say what's going on for a matrix $A$ in Jordan normal form that has just a single block. If its diagonal entries are nonzero and it has at least one nonzero off-diagonal entry, all its powers also have nonzero off-diagonal entries... so none of its powers can equal 1.

So, any matrix $A$ whose nth power is $1$ for some $n$ must be diagonalizable!

view this post on Zulip David Egolf (Sep 04 2023 at 21:59):

That is very cool! It seems handy to know that different Jordan blocks in a matrix $A$ don't interact with each other when we multiply $A$ by itself or add $A$ to multiples of itself. Given this, it makes a lot of sense to study how a single Jordan block interacts with itself under matrix multiplication.

I wrote out some powers of $N$ for various sizes of $N$. It looks like each multiplication by $N$ shifts all of its columns one to the right, bringing in a zero column on the left. This shift moves all the 1's one away from the main diagonal, and so we'd expect $N^m$ to have 1's $m$ above the main diagonal.

(By the way, I think $A^3 = \lambda^3 I + 3 \lambda^2 N + 3 \lambda N^2 + N^3$.)

Using the binomial expansion theorem for $A^n = ((\lambda I) + N)^n = \sum_{k = 0}^n \binom{n}{k} (\lambda I)^{n-k} N^k$, we see that (for $n>0$) this sum always contains a $k=1$ term, $\binom{n}{1} (\lambda I)^{n-1} N^1 = n \lambda^{n-1} N$, which has nonzero entries one above the main diagonal. And as you pointed out, all the other terms in this sum have nonzero entries at different "heights" above the main diagonal, so these nonzero off-the-main-diagonal entries in the $k=1$ term can't cancel out with the other terms. So, when $A = \lambda I + N$ (for $\lambda \neq 0$), $A^n$ really does have nonzero entries off the main diagonal, for each $n>0$.

Now, assume we have a matrix $A$ in Jordan normal form, but with potentially more than one Jordan block. I will assume that different blocks do not interact with each other under matrix multiplication or matrix addition. To take advantage of this, I write $A$ as $A = A_1 + A_2 + \dots + A_m$, where each matrix $A_i$ contains the $i$th Jordan block of $A$ and is otherwise zero. Then I am assuming that $A^2 = (A_1 + A_2 + \dots + A_m)(A_1 + A_2 + \dots + A_m) = A_1^2 + A_2^2 + \dots + A_m^2$ and in general $A^n = A_1^n + A_2^n + \dots + A_m^n$. Now, a particular $A_i$ is a matrix containing a single Jordan block of $A$ and a bunch of zeros. I think we can compute powers of $A_i$ just by computing the powers of the nonzero part of $A_i$ - call this $A_i'$ - which is a single Jordan block of $A$, and then adding back in all the extra zeros. If $A$ has a Jordan block $A_i'$ containing 1's off the main diagonal and having a nonzero main diagonal, then powers of $A_i'$ have nonzero entries off the diagonal too (by the discussion above). That means that powers of $A_i$ have nonzero entries off the main diagonal, and $A^n = A_1^n + A_2^n + \dots + A_m^n$ also has nonzero entries off the main diagonal.
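A quick numerical check of that block-wise assumption (a sketch with numpy, using two small Jordan blocks made up for the purpose):

```python
import numpy as np

# Two Jordan blocks: a 2x2 block with eigenvalue 2 and a 3x3 block with eigenvalue -1.
J1 = np.array([[2.0, 1.0],
               [0.0, 2.0]])
J2 = -np.eye(3) + np.eye(3, k=1)

A = np.zeros((5, 5))        # assemble the block-diagonal matrix with J1 and J2 on the diagonal
A[:2, :2] = J1
A[2:, 2:] = J2

A3 = np.linalg.matrix_power(A, 3)
assert np.allclose(A3[:2, :2], np.linalg.matrix_power(J1, 3))     # each block is cubed separately
assert np.allclose(A3[2:, 2:], np.linalg.matrix_power(J2, 3))
assert np.allclose(A3[:2, 2:], 0) and np.allclose(A3[2:, :2], 0)  # the blocks never mix
print("block-diagonal powers check out")
```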

view this post on Zulip John Baez (Sep 05 2023 at 10:24):

David Egolf said:

It seems handy to know that different Jordan blocks in a matrix $A$ don't interact with each other when we multiply $A$ by itself or add $A$ to multiples of itself. Given this, it makes a lot of sense to study how a single Jordan block interacts with itself under matrix multiplication.

Yes! And perhaps I didn't shout this loud enough, though it's clear from what I said: to understand how a single Jordan block $A$ interacts with itself under addition and multiplication, we just need to understand this for the matrix $N$ having 1's directly above the main diagonal, since

$$A = \lambda I + N$$

and the rest is the binomial formula.

So, remarkably, we've reduced the study of polynomials of matrices to the case of this one matrix $N$ - or really one $n \times n$ matrix $N$ for each $n$!

This is the power and beauty of Jordan normal form.

view this post on Zulip John Baez (Sep 05 2023 at 10:31):

It's also good to know that 'almost every' complex matrix is diagonalizable, in several very precise senses. So all the fancier Jordan blocks, beyond the $1 \times 1$ blocks, are 'rarely needed'.

view this post on Zulip John Baez (Sep 05 2023 at 10:32):

However, 'rarely needed' is an exaggeration. For example, consider differentiation acting on the space of complex polynomials of degree at most $n$. In the right basis this linear operator is just our friend $N$!

view this post on Zulip David Egolf (Sep 05 2023 at 19:53):

John Baez said:

However, 'rarely needed' is an exaggeration. For example, consider differentiation acting on the space of complex polynomials of degree at most $n$. In the right basis this linear operator is just our friend $N$!

The $N$ matrix acts by taking in one basis vector and spitting out another one (except in the case where it produces zero). If you put in the $i$th basis vector, it spits out the $(i-1)$th basis vector (or zero).

If $D$ represents the derivative operator, then $D(1) = 0$, $D(x) = 1$, $D((1/2) x^2) = x$, $D((1/6) x^3) = (1/2) x^2$, and so on. So, to represent differentiation of polynomials of up to degree $n$ using the matrix $N$, I'm guessing we'd want our basis to look like $\{1, x, (1/2) x^2, (1/6) x^3, \dots, (1/n!)x^n\}$. Neat!
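A small sympy sketch confirming that guess (the basis below is exactly the one proposed above; the rest is just bookkeeping):

```python
import sympy as sp

x = sp.symbols('x')
n = 4
basis = [x**k / sp.factorial(k) for k in range(n + 1)]   # {1, x, x^2/2!, ..., x^n/n!}

# Differentiating the k-th basis vector gives exactly the (k-1)-th basis vector
# (or zero for k = 0), so in this basis d/dx is the matrix N with 1's just
# above the main diagonal.
assert sp.diff(basis[0], x) == 0
for k in range(1, n + 1):
    assert sp.simplify(sp.diff(basis[k], x) - basis[k - 1]) == 0
print("d/dx sends basis[k] to basis[k-1]: its matrix is N")
```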

view this post on Zulip John Baez (Sep 05 2023 at 20:20):

Yes! And arguably this explains the appearance of $1/n!$ in Taylor's theorem.