Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: deprecated: mathematics

Topic: characters of finite group representations


view this post on Zulip John Baez (Sep 03 2023 at 09:45):

@Todd Trimble recently told me a slick proof of a cool fact he learned from Charles Rezk on MathOverflow.

First a warmup fact. Suppose ρ is a representation of a group G on a finite-dimensional complex vector space. If g ∈ G then tr(ρ(g)) is a finite sum of nth roots of unity, where n is the order of g: that is, the smallest integer n > 0 with 𝑔ⁿ = 1.

Why? Well, we have ρ(g)ⁿ = 1. You can always diagonalize a complex matrix whose nth power is 1. (See why?) So do that. Then its diagonal entries have to be nth roots of unity. So tr(ρ(g)) is a sum of nth roots of unity.

Now for the cool part. If G is a symmetric group, then tr(ρ(g)) is an integer!

This is harder, but still quick. Let F be the field consisting of all rational linear combinations of nth roots of unity. If k is relatively prime to n there's an automorphism of F sending each nth root of unity to its kth power. Applying this automorphism to the entries of our diagonalized matrix ρ(g) we get the matrix ρ(g)ᵏ.

But for a permutation with gⁿ = 1, gᵏ has the same cycle structure as g whenever k is relatively prime to n. (Draw some examples). So gᵏ is conjugate to g. So ρ(g)ᵏ is conjugate to ρ(g). But the trace of a matrix doesn't change when you conjugate that matrix by something, so tr(ρ(g)ᵏ) = tr(ρ(g)).

We've learned that tr(ρ(g)) is a sum of nth roots of unity, but it also doesn't change when we apply the automorphism of F sending any nth root of unity to its kth power, whenever k is relatively prime to n. By some Galois theory, this forces tr(ρ(g)) to be an integer!

(Charles Rezk said this much faster since he assumed you know all the background. I've tried to minimize the amount of jargon. But we still need some Galois theory at the end - basically, the only algebraic integers in F invariant under the Galois group of F over ℚ are ordinary integers. Maybe there's some easy way to see this fact in the special case at hand.)
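To make both facts concrete, here is a minimal numerical sketch (the 2-dimensional representation of S₃ as symmetries of an equilateral triangle is a standard construction assumed here, not spelled out in the post above): the eigenvalues of each ρ(g) are roots of unity, and every trace comes out an integer.

```python
import numpy as np

# The 2-dimensional irreducible representation of S_3, realized as the symmetry
# group of an equilateral triangle: r is rotation by 120 degrees (a 3-cycle),
# f is a reflection (a transposition).
c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])
f = np.array([[1.0, 0.0], [0.0, -1.0]])

group = {
    "e": np.eye(2), "r": r, "rr": r @ r,        # identity and the two 3-cycles
    "f": f, "rf": r @ f, "rrf": r @ r @ f,      # the three transpositions
}

for name, g in group.items():
    eigvals = np.linalg.eigvals(g)              # all on the unit circle: roots of unity
    print(name, np.round(eigvals, 3), round(float(np.trace(g)), 6))
```

The traces come out as 2 for the identity, −1 for the two 3-cycles, and 0 for the three transpositions: integers, as claimed.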

view this post on Zulip David Egolf (Sep 03 2023 at 15:58):

John Baez said:

You can always diagonalize a complex matrix whose nth power is 1. (See why?)

I don't see why, but I would like to! I tried to figure this out below, but I got stuck.

The minimal polynomial of a matrix $A$ is the polynomial $m$ of least degree so that $m(A)=0$. I guess that if $n$ is the smallest power such that $A^n=I$, then the minimal polynomial of $A$ is $m=x^n-1$, so that $m(A) = A^n-I = 0$. I'm not sure how I could prove this, though.

To show that $A$ is diagonalizable, I will try to show that $A$ has no repeated eigenvalues. I read on Wikipedia that the roots of the minimal polynomial $m$ are exactly the eigenvalues of $A$. In our case the minimal polynomial $m$ has $n$ distinct roots, namely the $n$th roots of 1 in $\mathbb{C}$. So, $A$ has at least $n$ distinct eigenvalues. If we knew that $A$ was an $n \times n$ matrix, then it could have at most $n$ distinct eigenvalues, and so we could conclude that all of its eigenvalues must be distinct, and so it must be diagonalizable.

Can we show that a square matrix $A$ must be an $n \times n$ matrix if $n$ is the smallest positive integer so that $A^n=I$? No - this is false. The $1 \times 1$ matrix $-1$ satisfies $(-1)^2=1$, but $(-1)^1 \neq 1$. So, it's back to the drawing board!

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 16:42):

David Egolf said:

John Baez said:

You can always diagonalize a complex matrix whose nth power is 1. (See why?)

I don't see why, but I would like to! I tried to figure this out below, but I got stuck. [...]

Simpler, I think, is to notice that if $A^n=I$ then you can explicitly describe the (full family of) "eigenprojectors" $P_\omega$, where $\omega$ is an $n$th root of unity, as

$$P_\omega = \frac{1}{n}\sum_{0\leq k \leq n-1} \bar{\omega}^k A^k.$$

To do spectral theory it is better to work in the "operator algebra". To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$). If you have a minimal polynomial equation satisfied by $A$ you can explicitly describe those projectors. Think of the baby examples $A^2 = I$, and then $A^3 = I$.
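As a quick numerical sanity check of this formula (with the $1/n$ normalization), here is a minimal sketch using a rotation by 120 degrees, which satisfies $A^3 = I$; the projectors it produces are idempotent, sum to the identity, and satisfy $A P_\omega = \omega P_\omega$:

```python
import numpy as np

n = 3
c, s = np.cos(2 * np.pi / n), np.sin(2 * np.pi / n)
A = np.array([[c, -s], [s, c]])          # rotation by 120 degrees, so A^3 = I

roots = [np.exp(2j * np.pi * m / n) for m in range(n)]   # the cube roots of unity

# P_omega = (1/n) * sum_k conj(omega)^k A^k
P = {w: sum(np.conj(w) ** k * np.linalg.matrix_power(A, k) for k in range(n)) / n
     for w in roots}

assert np.allclose(sum(P.values()), np.eye(2))   # the projectors decompose the identity
for w, Pw in P.items():
    assert np.allclose(Pw @ Pw, Pw)              # each one is idempotent
    assert np.allclose(A @ Pw, w * Pw)           # and projects onto the omega-eigenspace
print("eigenprojector identities verified")
```

(Here $P_1$ comes out as the zero matrix, since 1 is not an eigenvalue of a rotation by 120 degrees.)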

view this post on Zulip David Egolf (Sep 03 2023 at 17:46):

Thanks @Jorge Soto-Andrade for your explanation! Unfortunately, I am not following very well.

I'd especially appreciate any elaboration/explanation anyone could offer on the quote below:

Jorge Soto-Andrade said:

To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$).

For example, if we have managed to diagonalize $A$ as $A=PDP^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?

(References to a textbook that talks about this would also be welcome!)

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 18:51):

David Egolf said:

For example, if we have managed to diagonalize $A$ as $A=PDP^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?

(References to a textbook that talks about this would also be welcome!)
I am afraid that this way of fathoming diagonalisation is not the best one for "everyday life". What is really useful is to be able to find an eigenbasis for $A$ (an endomorphism of $V$), i.e. a basis of our vector space $V$ consisting of eigenvectors of $A$. Then you can easily figure out what happens with the powers of $A$, for example (a system-theoretical motivation for linear algebra). Then you have the eigenspaces $V_\lambda$ consisting of all eigenvectors associated to the same eigenvalue $\lambda$. They provide a direct sum decomposition of $V$. This is a "better" definition of diagonalisation, I claim.

Now, you can climb to the next floor and look at the operators (endomorphisms) of $V$ (the Russians do this very well in their textbooks, better than American textbooks I am afraid - I am looking for some nice reference in English though). So, you look at projectors $P_\lambda$ of $V$ onto $V_\lambda$ ALONG the other eigenspaces $V_\mu$. This means that $P_\lambda$ is the identity on $V_\lambda$ and equal to $0$ on $V_\mu$ for $\mu \neq \lambda$. Then you have that all the eigenprojectors add up to the identity of $V$, and so on. So, to diagonalise $A$ means to decompose the identity operator of $V$ in a way $A$ would love... Maybe you should work this out (in matrix form if you like) for the baby example $A^2 = I$, in dimension 2, say.
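For instance, working out that baby example with $A$ the swap matrix (a choice made here just for concreteness):

$$A=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad P_{+1}=\tfrac12(I+A)=\tfrac12\begin{pmatrix}1&1\\1&1\end{pmatrix},\qquad P_{-1}=\tfrac12(I-A)=\tfrac12\begin{pmatrix}1&-1\\-1&1\end{pmatrix},$$

and indeed $P_{+1}+P_{-1}=I$, $P_{\pm 1}^2=P_{\pm 1}$, $P_{+1}P_{-1}=0$, and $A=P_{+1}-P_{-1}$: the identity has been decomposed into the projectors onto the eigenlines spanned by $(1,1)$ and $(1,-1)$.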

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 19:18):

You may check "Linear Algebra Done Right" by Axler: linear.axler.net (also available at the Russian site gen.lib.rus.ec). His viewpoint is close to ours.

view this post on Zulip David Egolf (Sep 03 2023 at 19:26):

Awesome, that looks really helpful @Jorge Soto-Andrade ! It will take me a little time to work through this and hopefully understand it.

view this post on Zulip James Deikun (Sep 03 2023 at 19:49):

David Egolf said:

For example, if we have managed to diagonalize $A$ as $A=PDP^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?

An $n$ by $n$ diagonal matrix $D$ has $n$ eigenprojectors $p_k$, each of which has a 1 on the diagonal at position $(k,k)$ and the rest of the matrix 0. Then you take $P p_k P^{-1}$ as eigenprojectors for $A$.
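A minimal numerical sketch of this recipe (the particular matrix below is just an arbitrary diagonalizable example made up for illustration): build the $P p_k P^{-1}$ and check that they sum to the identity and reassemble $A$.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 3))               # a random change of basis
D = np.diag([2.0, -1.0, 5.0])
A = P @ D @ np.linalg.inv(P)                  # a diagonalizable matrix, A = P D P^-1

Pinv = np.linalg.inv(P)
projectors = []
for k in range(3):
    p_k = np.zeros((3, 3))
    p_k[k, k] = 1.0                           # 1 on the diagonal at (k, k), zero elsewhere
    projectors.append(P @ p_k @ Pinv)         # eigenprojector of A for eigenvalue D[k, k]

assert np.allclose(sum(projectors), np.eye(3))                          # decomposition of the identity
assert np.allclose(A, sum(D[k, k] * projectors[k] for k in range(3)))   # spectral decomposition of A
print("P p_k P^{-1} projectors check out")
```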

view this post on Zulip James Deikun (Sep 03 2023 at 19:54):

It doesn't work out this nicely for infinite matrices. The projectors arise from

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 20:31):

James Deikun said:

It doesn't work out this nicely for infinite matrices. The projectors arise from

Yes, but then you go over to the spectral measure and get an integral instead of the discrete direct sum decomposition of V into eigenspaces.

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 20:39):

John Baez said:

Todd Trimble recently told me a slick proof of a cool fact he learned from Charles Rezk on MathOverflow. [...] Now for the cool part. If G is a symmetric group, then tr(ρ(g)) is an integer!

Nice remark. The key point seems to be that if you Galois conjugate a linear representation of the symmetric group, you get an isomorphic representation. I would not resist the temptation to claim that a "deep" proof of this follows from the well-known fact that the symmetric group $S_n$ is $PGL(n, \mathbb F_1)$, the general linear group over "the field with one element" (this is the famous "boutade de Tits"). The irreducible representations of $PGL(n, \mathbb F_q)$ split into series parameterized by the multiplicative characters of the base finite field and its extensions. When you Galois conjugate, you change in general the parameter value. But for the field with one element all extensions collapse into the base field ("because" every power of 1 is 1) and you have no parameters.

view this post on Zulip Jorge Soto-Andrade (Sep 03 2023 at 21:37):

Jorge Soto-Andrade said:

The key point seems to be that if you Galois conjugate a linear representation of the symmetric group, you get an isomorphic representation, so that its character is a Galois-invariant algebraic integer, hence a rational integer. [...]

In a less metaphorical vein, the same Galois argument applies to show that the Gelfand character of any finite group $G$ is integer-valued (the Gelfand character is the sum of all irreducible characters of $G$, each one exactly once). Now, if you calculate the Gelfand character for the usual groups you always find non-negative integer values. I conjectured for a while that this was always true (the Gelfand character looks very much like the character of a permutation representation associated to a transitive $G$-space; for example its mean value is 1) until I stated this conjecture during a talk in Tokyo, and the day after Yokonuma came to see me to apologise (with a deep bow), saying that last night he had calculated the Gelfand character for the Mathieu group M11 and found negative values, so my conjecture was disproved... Gelfand himself told me later that he had made the same conjecture and asked his students to calculate the Gelfand character of every finite group in sight, but that they were too lazy to do that...

There are still several open questions related to Gelfand models. Gelfand's intuition was that they were representations with "very special properties". Notice that for $G = SO(3, \mathbb R)$ a Gelfand model is provided by the natural unitary representation of $G$ in the Hilbert space $L^2(S^2)$, whose decomposition into irreducibles involves the famous Legendre polynomials (as the kernels of the eigenprojectors regarded as integral operators, in the sense of functional analysis). Here by eigenprojectors I mean the projectors associated to the decomposition of $L^2(S^2)$ into a Hilbert direct sum of irreducibles (which are for sure eigenprojectors for some Laplacian-like intertwining operator).
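As a tiny worked example (using the standard character table of $S_3$, a well-known fact rather than anything from the message above): on the conjugacy classes $\{e\}$, transpositions, and 3-cycles, the irreducible characters of $S_3$ are $(1,1,1)$, $(1,-1,1)$ and $(2,0,-1)$, so the Gelfand character is

$$\chi_{\mathrm{Gelfand}} = (1,1,1)+(1,-1,1)+(2,0,-1) = (4,0,1),$$

integer-valued as the Galois argument predicts, and in this small case non-negative as well.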

view this post on Zulip John Baez (Sep 04 2023 at 08:20):

David Egolf said:

John Baez said:

You can always diagonalize a complex matrix whose nth power is 1. (See why?)

I don't see why, but I would like to! I tried to figure this out below, but I got stuck.

I see two main ways: the "hand-crafted" way, where you develop an argument specifically for this problem, or the "mechanical" way, where you use the technology developed to solve many similar problems.

view this post on Zulip John Baez (Sep 04 2023 at 08:22):

You're trying to diagonalize a complex matrix here. You can't diagonalize every complex matrix. How close can you come? This is what the Jordan normal form tells you!

view this post on Zulip John Baez (Sep 04 2023 at 08:24):

For any linear transformation of a finite-dimensional vector space, you can find a basis such that the transformation is described by a block diagonal matrix of this form:

view this post on Zulip John Baez (Sep 04 2023 at 08:26):

It has an arbitrary number of blocks of arbitrary size; each block has some fixed number $\lambda_i$ running down the diagonal and 1's directly above the diagonal. All the other entries are zero.

view this post on Zulip John Baez (Sep 04 2023 at 08:27):

The numbers $\lambda_i$ are the eigenvalues of your transformation, and there's one eigenvector (up to scalar multiples) for each block.

(If two or more blocks have equal $\lambda_i$'s you get more eigenvectors by taking linear combinations of the eigenvectors for these blocks.)

view this post on Zulip John Baez (Sep 04 2023 at 08:29):

This result is a power tool for studying linear transformations of a finite-dimensional complex vector space! So if someone tells you "I have a complex matrix whose nth power is 1", you can change basis to put it in Jordan normal form, and see what it must be like for its nth power to equal 1.
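Here is a small sketch of that "mechanical" route (assuming sympy is available, with matrices chosen just for illustration): a matrix satisfying $A^n = I$ gets a diagonal Jordan form, while a shear matrix does not.

```python
import sympy as sp

# A permutation matrix for the 3-cycle (1 2 3): it satisfies A**3 == eye(3).
A = sp.Matrix([[0, 0, 1],
               [1, 0, 0],
               [0, 1, 0]])
P, J = A.jordan_form()      # A = P * J * P**-1
print(J)                    # diagonal, with the three cube roots of unity on the diagonal

# By contrast, a shear is not diagonalizable: its Jordan form keeps a 1 above
# the diagonal, and indeed no power of it equals the identity.
B = sp.Matrix([[1, 1],
               [0, 1]])
_, K = B.jordan_form()
print(K)                    # Matrix([[1, 1], [0, 1]])
```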

view this post on Zulip John Baez (Sep 04 2023 at 08:37):

Being a lazy bum I reached for this "mechanical" method rather than trying to develop a hand-crafted method.

view this post on Zulip David Egolf (Sep 04 2023 at 16:04):

Thanks for explaining, @John Baez ! I've seen the Jordan normal form before, but have never been comfortable with it. So it's great to have an example where it's useful for something I'm trying to understand.

Let me see how this works. We have a linear map going from a complex finite-dimensional vector space $V$ to itself, $\rho(g): V \to V$, where $\rho(g)^n = 1_V$. There exists a basis for $V$ so that $\rho(g)$ can be described by a matrix $A$ in Jordan normal form. In that basis, for any integer $m>0$, the linear map $\rho(g)^m$ is described by the matrix $A^m$.

I am guessing that if $A$ has any off-diagonal 1's, then $A^m$ will have off-diagonal non-zero entries as well, for all $m > 0$. I started trying to prove this, but it was trickier than I expected! To prove this though, I suspect it is important that $\rho(g)$ must be invertible, so that $A$ has no zero entries on its diagonal. (And since $\rho(g)^n=1_V$, $\rho(g)$ must be invertible.)

If we assume this, however, then since $A^m=I$ has no off-diagonal non-zero entries, we conclude that the Jordan normal form $A$ for $\rho(g)$ must be diagonal. So, there exists a basis for $V$ in which the matrix describing $\rho(g)$ is diagonal, which is what we wanted to show.

EDIT:
I think I understand why if the Jordan normal form matrix $A$ has any off-diagonal 1's, then $A^m$ for $m>0$ will have off-diagonal non-zero entries too. If $A$ has an off-diagonal 1, then for some basis vectors $b_i$ and $b_{i-1}$ (in the basis that generates the Jordan normal form) we have $\rho(g)(b_i) = \lambda_i b_i + 1 b_{i-1}$. If we keep applying $\rho(g)$ to this result, we never get to map the $b_{i-1}$ to zero (because $\rho(g)$ is invertible) and we never get to map the $b_{i-1}$ back into some amount of $b_i$ (by the structure of the Jordan normal form matrix).

At least that's my intuition. Looking at this again, I think there is some more work to do to upgrade this to a proof.

view this post on Zulip John Baez (Sep 04 2023 at 19:42):

David Egolf said:

Thanks for explaining, John Baez ! I've seen the Jordan normal form before, but have never been comfortable with it. So it's great to have an example where it's useful for something I'm trying to understand.

Yes, I think you need to use it to appreciate it. Jordan normal form makes you the "king of linear transformations" - or at least king of any one linear transformation, since it lets you quickly settle many questions about a single linear transformation $T$ and polynomials $a_n T^n + \cdots + a_1 T + a_0$.

I am guessing that if $A$ has any off-diagonal 1's, then $A^m$ will have off-diagonal non-zero entries as well, for all $m > 0$.

That's true if $A$ also has enough nonzero on-diagonal entries. Otherwise we get counterexamples like

$$A = \left( \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right)$$

which has $A^2 = 0$.

But anyway, yes: this is the key to solving my puzzle, if you want to do it with Jordan normal forms.

I started trying to prove this, but it was trickier than I expected! To prove this though, I suspect it is important that $\rho(g)$ must be invertible, so that $A$ has no zero entries on its diagonal.

Good! But that's more than you need. It's easiest to say what's going on for a matrix $A$ in Jordan normal form that has just a single block. If its diagonal entries are nonzero and it has at least one nonzero off-diagonal entry, all its powers also have nonzero off-diagonal entries... so none of its powers can equal 1.

But to get $A^n = 1$, the diagonal entries of $A$ can't be zero.

So, to get $A^n = 1$, the off-diagonal entries of $A$ must be zero.

So, any matrix $A$ whose nth power is $1$ for some $n$ must be diagonalizable!

view this post on Zulip John Baez (Sep 04 2023 at 19:49):

David Egolf said:

At least that's my intuition. Looking at this again, I think there is some more work to do to upgrade this to a proof.

The most important thing is to get some intuition for how to use Jordan normal form to do stuff... you can always rigorize your argument later.

view this post on Zulip John Baez (Sep 04 2023 at 19:50):

But here are some extra tips! First, it's good to think about a single Jordan block since when you apply any polynomial to a matrix in Jordan normal form you just apply it to each block: the different blocks don't talk to each other.

view this post on Zulip John Baez (Sep 04 2023 at 19:51):

Second, we can always write a matrix that's a single Jordan block as

$$\lambda I + N$$

where $I$ is the identity matrix and $N$ is a matrix with 1's right above the diagonal.

view this post on Zulip John Baez (Sep 04 2023 at 19:55):

The matrix $N$ is what we call nilpotent: if we raise it to a high enough power we get zero!

It's good to take a matrix like this and square it, cube it, etc. to get an intuition for how this works, and when you get $N^n = 0$. This matrix is very important, since nilpotent transformations are important, and any nilpotent matrix has a Jordan form where all the Jordan blocks are of the form $N$. (Of course this matrix I'm calling $N$ comes in various sizes.)

view this post on Zulip John Baez (Sep 04 2023 at 19:57):

I guess I need to say: $N$ has all its nonzero entries one above the main diagonal, $N^2$ has all its nonzero entries two above the main diagonal, and so on.
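A minimal numpy sketch of that suggestion, in the 4×4 case (an arbitrary size chosen for illustration): each power pushes the 1's one step further above the diagonal, and $N^4 = 0$.

```python
import numpy as np

N = np.eye(4, k=1)          # 4x4 matrix with 1's directly above the main diagonal
for m in range(1, 5):
    Nm = np.linalg.matrix_power(N, m)
    # N^m has its 1's on the m-th superdiagonal; N^4 is the zero matrix
    print(f"N^{m} =\n{Nm.astype(int)}\n")
```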

view this post on Zulip John Baez (Sep 04 2023 at 19:59):

So, what happens if we take a power of

$$A = \lambda I + N?$$

We get things like

$$A^2 = \lambda^2 + 2 \lambda N + N^2$$

$$A^3 = \lambda^3 + 3 \lambda^2 N + 3 \lambda N + N^3$$

view this post on Zulip John Baez (Sep 04 2023 at 20:00):

The matrix $N^k$ has all its nonzero entries $k$ above the main diagonal, so the different terms in these sums can't cancel each other!

view this post on Zulip John Baez (Sep 04 2023 at 20:01):

So, if $\lambda \ne 0$, all the powers of $A$ will have nonzero entries off the main diagonal.
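In the smallest interesting case, a $2 \times 2$ Jordan block, this is completely explicit:

$$A=\begin{pmatrix}\lambda&1\\0&\lambda\end{pmatrix},\qquad A^m=\begin{pmatrix}\lambda^m&m\lambda^{m-1}\\0&\lambda^m\end{pmatrix},$$

so for $\lambda\neq 0$ the off-diagonal entry $m\lambda^{m-1}$ is nonzero for every $m\geq 1$, and no power of $A$ can equal $1$.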

view this post on Zulip John Baez (Sep 04 2023 at 20:03):

This is a way of proving this claim:

It's easiest to say what's going on for a matrix $A$ in Jordan normal form that has just a single block. If its diagonal entries are nonzero and it has at least one nonzero off-diagonal entry, all its powers also have nonzero off-diagonal entries... so none of its powers can equal 1.

So, any matrix $A$ whose nth power is $1$ for some $n$ must be diagonalizable!

view this post on Zulip David Egolf (Sep 04 2023 at 21:59):

That is very cool! It seems handy to know that different Jordan blocks in a matrix $A$ don't interact with each other when we multiply $A$ by itself or add $A$ to multiples of itself. Given this, it makes a lot of sense to study how a single Jordan block interacts with itself under matrix multiplication.

I wrote out some powers of $N$ for various sizes of $N$. It looks like each multiplication by $N$ shifts all of its columns one to the right, bringing in a zero column on the left. This shift moves all the 1's one away from the main diagonal, and so we'd expect $N^m$ to have 1's $m$ above the main diagonal.

(By the way, I think $A^3 = \lambda^3 I + 3 \lambda^2 N + 3 \lambda N^2 + N^3$.)

Using the binomial expansion theorem for $A^n = ((\lambda I) + N)^n = \sum_{k = 0}^n \binom{n}{k} (\lambda I)^{n-k} N^k$, we see that (for $n>0$) this sum always contains a $k=1$ term, $\binom{n}{1} (\lambda I)^{n-1} N^1 = n \lambda^{n-1} N$, which has nonzero entries one above the main diagonal. And as you pointed out, all the other terms in this sum have nonzero entries at different "heights" above the main diagonal, so these nonzero off-the-main-diagonal entries in the $k=1$ term can't cancel out with the other terms. So, when $A = \lambda I + N$ (for $\lambda \neq 0$), $A^n$ really does have nonzero entries off the main diagonal, for each $n>0$.

Now, assume we have a matrix $A$ in Jordan normal form, but with potentially more than one Jordan block. I will assume that different blocks do not interact with each other under matrix multiplication or matrix addition. To take advantage of this, I write $A$ as $A = A_1 + A_2 + \dots + A_m$, where each matrix $A_i$ contains the $i$th Jordan block of $A$ and is otherwise zero. Then I am assuming that $A^2 = (A_1 + A_2 + \dots + A_m)(A_1 + A_2 + \dots + A_m) = A_1^2 + A_2^2 + \dots + A_m^2$ and in general $A^n = A_1^n + A_2^n + \dots + A_m^n$. Now, a particular $A_i$ is a matrix containing a single Jordan block of $A$ and a bunch of zeros. I think we can compute powers of $A_i$ just by computing the powers of the nonzero part of $A_i$ - call this $A_i'$ - which is a single Jordan block of $A$, and then adding back in all the extra zeros. If $A$ has a Jordan block $A_i'$ containing 1's off the main diagonal and having a nonzero main diagonal, then powers of $A_i'$ have nonzero entries off the diagonal too (by the discussion above). That means that powers of $A_i$ have nonzero entries off the main diagonal, and $A^n = A_1^n + A_2^n + \dots + A_m^n$ also has nonzero entries off the main diagonal.
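A quick numerical check of that block-wise assumption (a sketch with numpy, using two small Jordan blocks made up for the purpose):

```python
import numpy as np

# Two Jordan blocks: a 2x2 block with eigenvalue 2 and a 3x3 block with eigenvalue -1.
J1 = np.array([[2.0, 1.0],
               [0.0, 2.0]])
J2 = -np.eye(3) + np.eye(3, k=1)

A = np.zeros((5, 5))        # assemble the block-diagonal matrix with J1 and J2 on the diagonal
A[:2, :2] = J1
A[2:, 2:] = J2

A3 = np.linalg.matrix_power(A, 3)
assert np.allclose(A3[:2, :2], np.linalg.matrix_power(J1, 3))     # each block is cubed separately
assert np.allclose(A3[2:, 2:], np.linalg.matrix_power(J2, 3))
assert np.allclose(A3[:2, 2:], 0) and np.allclose(A3[2:, :2], 0)  # the blocks never mix
print("block-diagonal powers check out")
```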

view this post on Zulip John Baez (Sep 05 2023 at 10:24):

David Egolf said:

It seems handy to know that different Jordan blocks in a matrix $A$ don't interact with each other when we multiply $A$ by itself or add $A$ to multiples of itself. Given this, it makes a lot of sense to study how a single Jordan block interacts with itself under matrix multiplication.

Yes! And perhaps I didn't shout this loud enough, though it's clear from what I said: to understand how a single Jordan block $A$ interacts with itself under addition and multiplication, we just need to understand this for the matrix $N$ having 1's directly above the main diagonal, since

$$A = \lambda I + N$$

and the rest is the binomial formula.

So, remarkably, we've reduced the study of polynomials of matrices to the case of this one matrix $N$ - or really one $n \times n$ matrix $N$ for each $n$!

This is the power and beauty of Jordan normal form.

view this post on Zulip John Baez (Sep 05 2023 at 10:31):

It's also good to know that 'almost every' complex matrix is diagonalizable, in several very precise senses. So all the fancier Jordan blocks, beyond the $1 \times 1$ blocks, are 'rarely needed'.

view this post on Zulip John Baez (Sep 05 2023 at 10:32):

However, 'rarely needed' is an exaggeration. For example, consider differentiation acting on the space of complex polynomials of degree at most $n$. In the right basis this linear operator is just our friend $N$!

view this post on Zulip David Egolf (Sep 05 2023 at 19:53):

John Baez said:

However, 'rarely needed' is an exaggeration. For example, consider differentiation acting on the space of complex polynomials of degree at most $n$. In the right basis this linear operator is just our friend $N$!

The $N$ matrix acts by taking in one basis vector and spitting out another one (except in the case where it produces zero). If you put in the $i$th basis vector, it spits out the $(i-1)$th basis vector (or zero).

If $D$ represents the derivative operator, then $D(1) = 0$, $D(x) = 1$, $D((1/2) x^2) = x$, $D((1/6) x^3) = (1/2) x^2$, and so on. So, to represent differentiation of polynomials of up to degree $n$ using the matrix $N$, I'm guessing we'd want our basis to look like $\{1, x, (1/2) x^2, (1/6) x^3, \dots, (1/n!)x^n\}$. Neat!
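A small sympy sketch confirming that guess (the basis below is exactly the one proposed above; the rest is just bookkeeping):

```python
import sympy as sp

x = sp.symbols('x')
n = 4
basis = [x**k / sp.factorial(k) for k in range(n + 1)]   # {1, x, x^2/2!, ..., x^n/n!}

# Differentiating the k-th basis vector gives exactly the (k-1)-th basis vector
# (or zero for k = 0), so in this basis d/dx is the matrix N with 1's just
# above the main diagonal.
assert sp.diff(basis[0], x) == 0
for k in range(1, n + 1):
    assert sp.simplify(sp.diff(basis[k], x) - basis[k - 1]) == 0
print("d/dx sends basis[k] to basis[k-1]: its matrix is N")
```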

view this post on Zulip John Baez (Sep 05 2023 at 20:20):

Yes! And arguably this explains the appearance of $1/n!$ in Taylor's theorem.