@Todd Trimble recently told me a slick proof of a cool fact he learned from Charles Rezk on MathOverflow.
First a warmup fact. Suppose ρ is a representation of a group G on a finite-dimensional complex vector space. If g ∈ G then tr(ρ(g)) is a finite sum of nth roots of unity, where n is the order of g: that is, the smallest integer n > 0 with 𝑔ⁿ = 1.
Why? Well, we have ρ(g)ⁿ = 1. You can always diagonalize a complex matrix whose nth power is 1. (See why?) So do that. Then its diagonal entries have to be nth roots of unity. So tr(ρ(g)) is a sum of nth roots of unity.
Now for the cool part. If G is a symmetric group, then tr(ρ(g)) is an integer!
This is harder, but still quick. Let F be the field consisting of all rational linear combinations of nth roots of unity. If k is relatively prime to n there's an automorphism of F sending each nth root of unity to its kth power. Applying this automorphism to the entries of our diagonalized matrix ρ(g) we get the matrix ρ(g)ᵏ.
But for a permutation with gⁿ = 1, gᵏ has the same cycle structure as g whenever k is relatively prime to n. (Draw some examples). So gᵏ is conjugate to g. So ρ(g)ᵏ is conjugate to ρ(g). But the trace of a matrix doesn't change when you conjugate that matrix by something, so tr(ρ(g)ᵏ) = tr(ρ(g)).
We've learned that tr(ρ(g)) is a sum of nth roots of unity, but it also doesn't change when we apply the automorphism of F sending any nth root of unity to its kth power, whenever k is relatively prime to n. By some Galois theory, this forces tr(ρ(g)) to be an integer!
(Charles Rezk said this much faster since he assumed you know all the background. I've tried to minimize the amount of jargon. But we still need some Galois theory at the end - basically, the only algebraic integers in F invariant under the Galois group of F over ℚ are ordinary integers. Maybe there's some easy way to see this fact in the special case at hand.)
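As a quick sanity check, here is a minimal NumPy sketch of both facts in one concrete case: the 2-dimensional irreducible representation of the symmetric group $S_3$, evaluated on a 3-cycle. (The matrix below is that representation written in the basis $e_1 - e_2, e_2 - e_3$ of the sum-zero plane in $\mathbb{C}^3$; the specific example is just an illustration.)

```python
import numpy as np

rho_g = np.array([[0, -1],
                  [1, -1]])           # the 3-cycle (1 2 3) in the 2-dim irrep of S_3

# g has order 3, and indeed rho(g)**3 is the identity:
assert np.array_equal(np.linalg.matrix_power(rho_g, 3), np.eye(2, dtype=int))

eigvals = np.linalg.eigvals(rho_g)
print(eigvals)                        # the two primitive cube roots of unity
print(np.allclose(eigvals**3, 1))     # True: each eigenvalue is a cube root of 1
print(np.trace(rho_g))                # -1, an ordinary integer, as claimed
```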
John Baez said:
You can always diagonalize a complex matrix whose nth power is 1. (See why?)
I don't see why, but I would like to! I tried to figure this out below, but I got stuck.
The minimal polynomial of a matrix $A$ is the polynomial $p$ of least degree so that $p(A) = 0$. I guess that if $n$ is the smallest power such that $A^n = 1$, then the minimal polynomial of $A$ is $x^n - 1$, so that $A^n - 1 = 0$. I'm not sure how I could prove this, though.
To show that $A$ is diagonalizable, I will try to show that $A$ has no repeated eigenvalues. I read on Wikipedia that the roots of the minimal polynomial are exactly the eigenvalues of $A$. In our case the minimal polynomial $x^n - 1$ has $n$ distinct roots, namely the $n$th roots of 1 in $\mathbb{C}$. So, $A$ has at least $n$ distinct eigenvalues. If we knew that $A$ was an $n \times n$ matrix, then it could have at most $n$ distinct eigenvalues, and so we could conclude that all of its eigenvalues must be distinct, and so it must be diagonalizable.
Can we show that a square matrix $A$ must be an $n \times n$ matrix if $n$ is the smallest positive integer so that $A^n = 1$? No - this is false. The 1x1 matrix $A = (-1)$ satisfies $A^2 = 1$, but it is not a $2 \times 2$ matrix. So, it's back to the drawing board!
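Before proving anything, one can at least test the claim experimentally. Here is a small SymPy sketch (the particular matrices are illustrative choices): some matrices with $A^n = 1$ are checked to be diagonalizable, and a $1 \times 1$ matrix of order 2 shows that the order need not match the size.

```python
import sympy as sp

A = sp.Matrix([[0, -1],
               [1, -1]])       # a 2x2 matrix with A**3 == I, i.e. of order 3
B = sp.Matrix([[-1]])          # a 1x1 matrix with B**2 == I, i.e. of order 2

assert A**3 == sp.eye(2)
assert B**2 == sp.eye(1)

print(A.is_diagonalizable())   # True (over the complex numbers)
print(A.eigenvals())           # the two primitive cube roots of unity
print(B.shape)                 # (1, 1): order 2, yet not a 2x2 matrix
```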
David Egolf said:
John Baez said:
You can always diagonalize a complex matrix whose nth power is 1. (See why?)
I don't see why, but I would like to! I tried to figure this out below, but I got stuck.
The minimal polynomial of a matrix $A$ is the polynomial $p$ of least degree so that $p(A) = 0$. I guess that if $n$ is the smallest power such that $A^n = 1$, then the minimal polynomial of $A$ is $x^n - 1$, so that $A^n - 1 = 0$. I'm not sure how I could prove this, though.
To show that $A$ is diagonalizable, I will try to show that $A$ has no repeated eigenvalues. I read on Wikipedia that the roots of the minimal polynomial are exactly the eigenvalues of $A$. In our case the minimal polynomial $x^n - 1$ has $n$ distinct roots, namely the $n$th roots of 1 in $\mathbb{C}$. So, $A$ has at least $n$ distinct eigenvalues. If we knew that $A$ was an $n \times n$ matrix, then it could have at most $n$ distinct eigenvalues, and so we could conclude that all of its eigenvalues must be distinct, and so it must be diagonalizable.
Can we show that a square matrix $A$ must be an $n \times n$ matrix if $n$ is the smallest positive integer so that $A^n = 1$? No - this is false. The 1x1 matrix $A = (-1)$ satisfies $A^2 = 1$, but it is not a $2 \times 2$ matrix. So, it's back to the drawing board!
Simpler, I think, is to notice that if $A^n = 1$ then you can explicitly describe the (full family of) "eigenprojectors" $P_\omega$, where $\omega$ is an $n$-th root of unity, as $P_\omega = \frac{1}{n}\sum_{k=0}^{n-1} \omega^{-k} A^k$. To do spectral theory it is better to work in the "operator algebra". To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$). If you have a minimal polynomial equation satisfied by $A$ you can explicitly describe those projectors. Think of the baby example $A^2 = 1$: then $P_\pm = \frac{1}{2}(1 \pm A)$, and $1 = P_+ + P_-$, $A = P_+ - P_-$.
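Here is a minimal NumPy sketch of this recipe, assuming the eigenprojector formula above, applied to one illustrative matrix with $A^3 = 1$ (a cyclic permutation matrix):

```python
# Sketch: for A with A**n == I, the operators
#   P_w = (1/n) * sum_{k=0}^{n-1} w**(-k) * A**k,   w an n-th root of unity,
# are projectors onto the eigenspaces of A and they sum to the identity.
import numpy as np

n = 3
A = np.roll(np.eye(n), 1, axis=0)            # cyclic permutation matrix, A**3 == I
assert np.allclose(np.linalg.matrix_power(A, n), np.eye(n))

roots = [np.exp(2j * np.pi * j / n) for j in range(n)]
projectors = {}
for w in roots:
    P = sum(w**(-k) * np.linalg.matrix_power(A, k) for k in range(n)) / n
    projectors[w] = P
    assert np.allclose(P @ P, P)             # idempotent
    assert np.allclose(A @ P, w * P)         # projects onto the w-eigenspace

assert np.allclose(sum(projectors.values()), np.eye(n))   # decomposition of the identity
print("eigenprojector identities verified for n =", n)
```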
Thanks @Jorge Soto-Andrade for your explanation! Unfortunately, I am not following very well.
I'd especially appreciate any elaboration/explanation anyone could offer on the quote below:
Jorge Soto-Andrade said:
To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$).
For example, if we have managed to diagonalize $A$ as $A = S D S^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?
(References to a textbook that talks about this would also be welcome!)
David Egolf said:
Thanks Jorge Soto-Andrade for your explanation! Unfortunately, I am not following very well.
I'd especially appreciate any elaboration/explanation anyone could offer on the quote below:
Jorge Soto-Andrade said:
To diagonalize a matrix $A$ (= linear endomorphism) is equivalent to finding a decomposition of the identity into a family of (orthogonal) projectors which are eigenprojectors (they project onto the eigenspaces of $A$).
For example, if we have managed to diagonalize $A$ as $A = S D S^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?
(References to a textbook that talks about this would also be welcome!)
I am afraid that this way of fathoming diagonalisation is not the best one for "everyday life". What is really useful is to be able to find an eigenbasis for $A$ (an endomorphism of $V$), i.e. a basis of our vector space $V$ consisting of eigenvectors of $A$. Then you can easily figure out what happens with the powers of $A$, for example (a system-theoretical motivation for linear algebra). Then you have the eigenspaces $V_\lambda$, consisting of all eigenvectors associated to the same eigenvalue $\lambda$. They provide a direct sum decomposition of $V$. This is a "better" definition of diagonalisation, I claim. Now, you can climb to the next floor and look at the operators (endomorphisms) of $V$ (the Ruskis do this very well in their textbooks, better than American textbooks I am afraid - I am looking for some nice reference in English though). So, you look at the projectors $P_\lambda$ of $V$ onto $V_\lambda$ ALONG the other eigenspaces $V_\mu$. This means that $P_\lambda$ is the identity on $V_\lambda$ and equal to 0 on $V_\mu$ for $\mu \neq \lambda$. Then you have that all the eigenprojectors add up to the identity of $V$, and so on. So, to diagonalise $A$ means to decompose the identity operator of $V$ in a way $A$ would love... Maybe you should work this out (in matrix form if you like) for the baby example $A^2 = 1$, in dimension 2, say.
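Here is one way to work out the dimension-2 baby example in matrix form (the specific matrix $A$, the swap map on $\mathbb{C}^2$, is an illustrative choice):

```python
# The baby example A**2 == 1 in dimension 2, done with explicit matrices.
import numpy as np

A = np.array([[0., 1.],
              [1., 0.]])             # swaps the two coordinates, so A @ A == I

# Eigenbasis: (1, 1) with eigenvalue +1 and (1, -1) with eigenvalue -1,
# giving the direct sum decomposition V = V_{+1} (+) V_{-1}.
P_plus  = (np.eye(2) + A) / 2        # projector onto V_{+1} along V_{-1}
P_minus = (np.eye(2) - A) / 2        # projector onto V_{-1} along V_{+1}

assert np.allclose(P_plus + P_minus, np.eye(2))   # decomposition of the identity
assert np.allclose(P_plus @ P_plus, P_plus)
assert np.allclose(P_minus @ P_minus, P_minus)
assert np.allclose(A, P_plus - P_minus)           # A decomposed the way it "loves"
```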
You may check "Linear algebra done right" by Axler: linear.axler.net, also available at the Ruski site gen.lib.rus.ec. His viewpoint is close to ours.
Awesome, that looks really helpful @Jorge Soto-Andrade ! It will take me a little time to work through this and hopefully understand it.
David Egolf said:
For example, if we have managed to diagonalize $A$ as $A = S D S^{-1}$ for a diagonal matrix $D$, how does this diagonalization induce a decomposition of the identity into eigenprojectors?
An $n \times n$ diagonal matrix $D$ has eigenprojectors $E_i$, each of which has a 1 on the diagonal at position $(i, i)$ and the rest of the matrix 0. Then you take $P_i = S E_i S^{-1}$ as eigenprojectors for $A = S D S^{-1}$.
It doesn't work out this nicely for infinite matrices. The projectors arise from
- the fact that a matrix has eigenvalues, which can fail for infinite matrices;
- the fact that finite coproducts are the same thing as finite products in the category of vector spaces, which always fails for infinite vector spaces.
James Deikun said:
It doesn't work out this nicely for infinite matrices. The projectors arise from
- the fact that a matrix has eigenvalues, which can fail for infinite matrices;
- the fact that finite coproducts are the same thing as finite products in the category of vector spaces, which always fails for infinite vector spaces.
Yes, but then you go over to the spectral measure and get an integral instead of the discrete direct sum decomposition of V into eigenspaces.
John Baez said:
Todd Trimble recently told me a slick proof of a cool fact he learned from Charles Rezk on MathOverflow.
First a warmup fact. Suppose ρ is a representation of a group G on a finite-dimensional complex vector space. If g ∈ G then tr(ρ(g)) is a finite sum of nth roots of unity, where n is the order of g: that is, the smallest integer n > 0 with 𝑔ⁿ = 1.
Why? Well, we have ρ(g)ⁿ = 1. You can always diagonalize a complex matrix whose nth power is 1. (See why?) So do that. Then its diagonal entries have to be nth roots of unity. So tr(ρ(g)) is a sum of nth roots of unity.
Now for the cool part. If G is a symmetric group, then tr(ρ(g)) is an integer!
This is harder, but still quick. Let F be the field consisting of all rational linear combinations of nth roots of unity. If k is relatively prime to n there's an automorphism of F sending each nth root of unity to its kth power. Applying this automorphism to the entries of our diagonalized matrix ρ(g) we get the matrix ρ(g)ᵏ.
But for a permutation with gⁿ = 1, gᵏ has the same cycle structure as g whenever k is relatively prime to n. (Draw some examples). So gᵏ is conjugate to g. So ρ(g)ᵏ is conjugate to ρ(g). But the trace of a matrix doesn't change when you conjugate that matrix by something, so tr(ρ(g)ᵏ) = tr(ρ(g)).
We've learned that tr(ρ(g)) is a sum of nth roots of unity, but it also doesn't change when we apply the automorphism of F sending any nth root of unity to its kth power, whenever k is relatively prime to n. By some Galois theory, this forces tr(ρ(g)) to be an integer!
(Charles Rezk said this much faster since he assumed you know all the background. I've tried to minimize the amount of jargon. But we still need some Galois theory at the end - basically, the only algebraic integers in F invariant under the Galois group of F over ℚ are ordinary integers. Maybe there's some easy way to see this fact in the special case at hand.)
Nice remark. The key point seems to be that if you Galois conjugate a linear representation of the symmetric group, you get an isomorphic representation, so that its character is a Galois-invariant algebraic integer, hence a rational integer. I would not resist the temptation to claim that a "deep" proof of this is that it follows from the well known fact that the symmetric group $S_n$ is $PGL(n, \mathbb{F}_1)$, the general linear group over "the field with one element" (this is the famous boutade of Tits). The irreducible representations of $GL(n, \mathbb{F}_q)$ split into series parameterized by the multiplicative characters of the base finite field and its extensions. When you Galois conjugate, you change in general the parameter value. But for the field with one element all extensions collapse into the base field ("because" every power of 1 is 1), and you have no parameters.
In a less metaphorical vein, the same Galois argument applies to show that the Gelfand character of any finite group $G$ is integer valued (the Gelfand character is the sum of all irreducible characters of $G$, each one exactly once). Now, if you calculate the Gelfand character for the usual groups you always find non-negative integer values. I conjectured for a while that this was always true (the Gelfand character looks very much like the character of a permutation representation associated to a transitive $G$-space; for example its mean value is 1), until I stated this conjecture during a talk in Tokyo, and the day after Yokonuma came to see me to apologise (with a deep bow), saying that the night before he had calculated the Gelfand character for the Mathieu group M11 and found negative values, so my conjecture was disproved... Gelfand himself told me later that he had made the same conjecture and asked his students to calculate the Gelfand character of every finite group in sight, but that they were too lazy to do that...
There are still several open questions related to Gelfand models. Gelfand's intuition was that they were representations with "very special properties". Notice that for $G = SO(3)$ a Gelfand model is provided by the natural unitary representation of $G$ in the Hilbert space $L^2(S^2)$, whose decomposition into irreducibles involves the famous Legendre polynomials (as the kernels of the eigenprojectors, regarded as integral operators in the sense of functional analysis). Here by eigenprojectors I mean the projectors associated to the decomposition of $L^2(S^2)$ into a Hilbert direct sum of irreducibles (which are for sure eigenprojectors for some Laplacian-like intertwining operator).
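As a tiny concrete illustration of the Gelfand character (using only the standard character table of $S_3$, the smallest interesting case): summing the three irreducible characters of $S_3$ gives non-negative integer values on every conjugacy class.

```python
# The Gelfand character of S_3: the sum of its three irreducible characters,
# evaluated on the three conjugacy classes (values from the standard character table).
irreducible_characters = {
    "trivial":  {"e": 1, "transposition":  1, "3-cycle":  1},
    "sign":     {"e": 1, "transposition": -1, "3-cycle":  1},
    "standard": {"e": 2, "transposition":  0, "3-cycle": -1},
}

gelfand = {
    cls: sum(chi[cls] for chi in irreducible_characters.values())
    for cls in ["e", "transposition", "3-cycle"]
}
print(gelfand)    # {'e': 4, 'transposition': 0, '3-cycle': 1} -- non-negative integers
```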
David Egolf said:
John Baez said:
You can always diagonalize a complex matrix whose nth power is 1. (See why?)
I don't see why, but I would like to! I tried to figure this out below, but I got stuck.
I see two main ways, the "hand-crafted way" where you develop an argument specifically for this problem, or the "mechanical" way where you use the technology developed to solve many similar problems.
You're trying to diagonalize a complex matrix here. You can't diagonalize every complex matrix. How close can you come? This is what the Jordan normal form tells you!
For any linear transformation of a finite-dimensional complex vector space, you can find a basis such that the transformation is described by a block diagonal matrix built out of "Jordan blocks": it has an arbitrary number of blocks of arbitrary size, each block has some fixed number $\lambda$ running down the diagonal and 1's directly above the diagonal, and all the other entries are zero.
The numbers $\lambda$ are the eigenvalues of your transformation, and there's one eigenvector (up to scalar multiples) for each block.
(If two or more blocks have equal $\lambda$'s you get more eigenvectors by taking linear combinations of the eigenvectors for these blocks.)
This result is a power tool for studying linear transformations of a finite-dimensional complex vector space! So if someone tells you "I have a complex matrix whose nth power is 1", you can change basis to put it in Jordan normal form, and see what it must be like for its nth power to equal 1.
Being a lazy bum I reached for this "mechanical" method rather than trying to develop a hand-crafted method.
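Here is what the "mechanical" method looks like in SymPy for one illustrative matrix with $A^3 = 1$: its Jordan normal form comes out diagonal, with the primitive cube roots of unity down the diagonal.

```python
# The "mechanical" method: compute the Jordan normal form of a matrix whose
# n-th power is 1 and observe that it is diagonal.
import sympy as sp

A = sp.Matrix([[0, -1],
               [1, -1]])       # an example matrix with A**3 == I
assert A**3 == sp.eye(2)

P, J = A.jordan_form()         # A == P * J * P**-1
sp.pprint(J)                   # diagonal, with the primitive cube roots of unity
assert J.is_diagonal()
```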
Thanks for explaining, @John Baez ! I've seen the Jordan normal form before, but have never been comfortable with it. So it's great to have an example where it's useful for something I'm trying to understand.
Let me see how this works. We have a linear map going from a complex finite-dimensional vector space to itself, $f: V \to V$, where $f^n = 1$. There exists a basis for $V$ so that $f$ can be described by a matrix $A$ in Jordan normal form. In that basis, for any integer $k \ge 1$, the linear map $f^k$ is described by the matrix $A^k$.
I am guessing that if $A$ has any off-diagonal 1's, then $A^k$ will have off-diagonal non-zero entries as well, for all $k \ge 1$. I started trying to prove this, but it was trickier than I expected! To prove this though, I suspect it is important that $A$ must be invertible, so that $A$ has no zero entries on its diagonal. (And since $A^n = 1$, $A$ must be invertible.)
If we assume this, however, then since $A^n = 1$ has no off-diagonal non-zero entries, we conclude that the Jordan normal form for $f$ must be diagonal. So, there exists a basis for $V$ in which the matrix describing $f$ is diagonal, which is what we wanted to show.
EDIT:
I think I understand why, if the Jordan normal form matrix $A$ has any off-diagonal 1's, then $A^k$ for $k \ge 1$ will have off-diagonal non-zero entries too. If $A$ has an off-diagonal 1, then for some basis vectors $e_i$ and $e_{i+1}$ (in the basis that generates the Jordan normal form) we have $A e_{i+1} = \lambda e_{i+1} + e_i$. If we keep applying $A$ to this result, we never get to map the $e_i$ to zero (because $A$ is invertible) and we never get to map the $e_i$ back into some amount of $e_{i+1}$ (by the structure of the Jordan normal form matrix).
At least that's my intuition. Looking at this again, I think there is some more work to do to upgrade this to a proof.
David Egolf said:
Thanks for explaining, John Baez ! I've seen the Jordan normal form before, but have never been comfortable with it. So it's great to have an example where it's useful for something I'm trying to understand.
Yes, I think you need to use it to appreciate it. Jordan normal form makes you the "king of linear transformations" - or at least king of any one linear transformation, since it lets you quickly settle many questions about a single linear transformation and polynomials of that transformation.
I am guessing that if $A$ has any off-diagonal 1's, then $A^k$ will have off-diagonal non-zero entries as well, for all $k \ge 1$.
That's true if $A$ also has enough nonzero on-diagonal entries. Otherwise we get counterexamples like $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, which has $A^2 = 0$.
But anyway, yes: this is the key to solving my puzzle, if you want to do it with Jordan normal forms.
I started trying to prove this, but it was trickier than I expected! To prove this though, I suspect it is important that $A$ must be invertible, so that $A$ has no zero entries on its diagonal.
Good! But that's more than you need. It's easiest to say what's going on for a matrix in Jordan normal form that has just a single block. If its diagonal entries are nonzero and it has at least one nonzero off-diagonal entry, all its powers also have nonzero off-diagonal entries... so none of its powers can equal 1.
But to get $A^n = 1$, the diagonal entries of $A$ can't be zero.
So, to get $A^n = 1$, the off-diagonal entries of $A$ must be zero.
So, any matrix whose $n$th power is $1$ for some $n \ge 1$ must be diagonalizable!
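A quick numerical illustration of this argument, for one illustrative $2 \times 2$ Jordan block with a root of unity on the diagonal and a 1 above it: no power of it ever equals the identity, because the off-diagonal entry never dies.

```python
# A single Jordan block with lambda = -1 (so lambda**2 == 1) and a 1 above the
# diagonal: its powers always have a nonzero off-diagonal entry, so no power
# can equal the identity matrix.
import numpy as np

lam = -1.0
J = np.array([[lam, 1.0],
              [0.0, lam]])

for k in range(1, 9):
    Jk = np.linalg.matrix_power(J, k)
    print(k, Jk[0, 1])                    # equals k * lam**(k-1), never zero
    assert not np.allclose(Jk, np.eye(2))
```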
David Egolf said:
At least that's my intuition. Looking at this again, I think there is some more work to do to upgrade this to a proof.
The most important thing is to get some intuition for how to use Jordan normal form to do stuff... you can always rigorize your argument later.
But here are some extra tips! First, it's good to think about a single Jordan block since when you apply any polynomial to a matrix in Jordan normal form you just apply it to each block: the different blocks don't talk to each other.
Second, we can always write a matrix that's a single Jordan block as $\lambda I + N$,
where $I$ is the identity matrix and $N$ is a matrix with 1's right above the diagonal.
The matrix $N$ is what we call nilpotent: if we raise it to a high enough power we get zero!
It's good to take a matrix like this and square it, cube it, etc. to get an intuition for how this works, and when you get $N^k = 0$. This matrix is very important, since nilpotent transformations are important, and any nilpotent matrix has a Jordan form where all the Jordan blocks are of the form $N$. (Of course this matrix I'm calling $N$ comes in various sizes.)
I guess I need to say: $N$ has all its nonzero entries one above the main diagonal, $N^2$ has all its nonzero entries two above the main diagonal, and so on.
So, what happens if we take a power of $\lambda I + N$?
We get things like
$(\lambda I + N)^3 = \lambda^3 I + 3\lambda^2 N + 3\lambda N^2 + N^3.$
The matrix $N^k$ has all its nonzero entries $k$ steps above the main diagonal, so the different terms in these sums can't cancel each other!
So, if $\lambda \ne 0$, all the powers of $\lambda I + N$ will have nonzero entries off the main diagonal.
This is a way of proving this claim:
It's easiest to say what's going on for a matrix in Jordan normal form that has just a single block. If its diagonal entries are nonzero and it has at least one nonzero off-diagonal entry, all its powers also have nonzero off-diagonal entries... so none of its powers can equal 1.
So, any matrix whose $n$th power is $1$ for some $n \ge 1$ must be diagonalizable!
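To see this concretely, here is a small NumPy check of the binomial expansion for a single Jordan block $\lambda I + N$ (with an illustrative $\lambda = 2$ and a $4 \times 4$ block): the $k\lambda^{k-1}N$ term sits alone on the first superdiagonal, so it never cancels.

```python
# Checking the binomial formula for a single Jordan block lambda*I + N, and
# that the term on the first superdiagonal never cancels when lambda != 0.
import numpy as np
from math import comb

n, lam = 4, 2.0
I = np.eye(n)
N = np.diag(np.ones(n - 1), k=1)          # 1's directly above the main diagonal

J = lam * I + N
for k in range(1, 7):
    binomial = sum(comb(k, j) * lam**(k - j) * np.linalg.matrix_power(N, j)
                   for j in range(k + 1))
    assert np.allclose(np.linalg.matrix_power(J, k), binomial)
    # the coefficient of N is k * lam**(k-1), sitting alone on its superdiagonal:
    assert np.isclose(np.linalg.matrix_power(J, k)[0, 1], k * lam**(k - 1))
print("binomial expansion verified")
```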
That is very cool! It seems handy to know that different Jordan blocks in a matrix $A$ don't interact with each other when we multiply $A$ by itself or add it to multiples of itself. Given this, it makes a lot of sense to study how a single Jordan block interacts with itself under matrix multiplication.
I wrote out some powers of $N$ for various sizes of $N$. It looks like each multiplication by $N$ shifts all of its columns one to the right, bringing in a zero column on the left. This shift moves all the 1's one further from the main diagonal, and so we'd expect $N^k$ to have 1's $k$ above the main diagonal.
(By the way, I think $N^n = 0$ when $N$ is an $n \times n$ matrix.)
Using the binomial expansion theorem for $(\lambda I + N)^k$, we see that (for $\lambda \ne 0$) this sum always contains a $k \lambda^{k-1} N$ term, which has nonzero entries one above the main diagonal. And as you pointed out, all the other terms in this sum have nonzero entries at different "heights" above the main diagonal, so these nonzero off-the-main-diagonal entries in the $k \lambda^{k-1} N$ term can't cancel out with the other terms. So, when $\lambda \ne 0$ (and $N \ne 0$), $(\lambda I + N)^k$ really does have nonzero entries off the main diagonal, for each $k \ge 1$.
Now, assume we have a matrix $A$ in Jordan normal form, but with potentially more than one Jordan block. I will assume that different blocks do not interact with each other under matrix multiplication or matrix addition. To take advantage of this, I write $A$ as $A = A_1 + \dots + A_m$, where each matrix $A_i$ contains the $i$th Jordan block of $A$ and is otherwise zero. Then I am assuming that $A^2 = A_1^2 + \dots + A_m^2$ and in general $A^k = A_1^k + \dots + A_m^k$. Now, a particular $A_i$ is a matrix containing a single Jordan block of $A$ and a bunch of zeros. I think we can compute powers of $A_i$ just by computing the powers of the nonzero part of $A_i$ - call this $B_i$ - which is a single Jordan block of $A$, and then adding back in all the extra zeros. If $A$ has a Jordan block $B_i$ containing 1's off the main diagonal and having a nonzero main diagonal, then powers of $B_i$ have nonzero entries off the diagonal too (by the discussion above). That means that powers of $A_i$ have nonzero entries off the main diagonal, and so $A^k$ also has nonzero entries off the main diagonal.
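Here is a small numerical check of the block-by-block assumption above, for one illustrative block-diagonal matrix (using scipy.linalg.block_diag just to assemble the blocks):

```python
# Blocks of a block-diagonal matrix don't interact under powers, and a block
# with nonzero diagonal plus an off-diagonal 1 keeps nonzero off-diagonal
# entries in every power of the whole matrix.
import numpy as np
from scipy.linalg import block_diag

B1 = np.array([[2.0, 1.0],
               [0.0, 2.0]])               # Jordan block with eigenvalue 2
B2 = np.array([[5.0]])                    # 1x1 Jordan block with eigenvalue 5
A = block_diag(B1, B2)

for k in range(1, 6):
    blockwise = block_diag(np.linalg.matrix_power(B1, k),
                           np.linalg.matrix_power(B2, k))
    assert np.allclose(np.linalg.matrix_power(A, k), blockwise)
    assert np.linalg.matrix_power(A, k)[0, 1] != 0     # the 1 never dies
print("blocks really don't interact under powers")
```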
David Egolf said:
It seems handy to know that different Jordan blocks in a matrix $A$ don't interact with each other when we multiply $A$ by itself or add it to multiples of itself. Given this, it makes a lot of sense to study how a single Jordan block interacts with itself under matrix multiplication.
Yes! And perhaps I didn't shout this loud enough, though it's clear from what I said: to understand how a single Jordan block interacts with itself under addition and multiplication, we just need to understand this for the matrix $N$ having 1's directly above the main diagonal, since any single Jordan block is $\lambda I + N$,
and the rest is the binomial formula.
So, remarkably, we've reduced the study of polynomials of matrices to the case of this one matrix - or really one matrix $N$ for each size!
This is the power and beauty of Jordan normal form.
It's also good to know that 'almost every' complex matrix is diagonalizable, in several very precise senses. So all the fancier Jordan blocks, beyond the $1 \times 1$ blocks, are 'rarely needed'.
However, 'rarely needed' is an exaggeration. For example consider differentiation acting on the space of complex polynomials of degree $\le n$. In the right basis this linear operator is just our friend $N$!
John Baez said:
However, 'rarely needed' is an exaggeration. For example consider differentiation acting on the space of complex polynomials of degree $\le n$. In the right basis this linear operator is just our friend $N$!
The matrix $N$ acts by taking in one basis vector and spitting out another one (except in the case where it produces zero). If you put in the $i$th basis vector, it spits out the $(i-1)$st basis vector (or zero).
If $N$ represents the derivative operator, then $N(1) = 0$, $N(x) = 1$, $N(x^2/2) = x$, $N(x^3/6) = x^2/2$, and so on. So, to represent differentiation of polynomials of up to degree $n$ using the matrix $N$, I'm guessing we'd want our basis to look like $1, x, x^2/2!, x^3/3!, \dots, x^n/n!$. Neat!
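Here is a small SymPy check of this guess for polynomials of degree at most 3: the matrix of $d/dx$ in the basis $1, x, x^2/2!, x^3/3!$ really is $N$.

```python
# The matrix of d/dx on polynomials of degree <= 3, in the basis x**k / k!,
# is exactly N: the matrix with 1's just above the main diagonal.
import sympy as sp

x = sp.symbols('x')
basis = [x**k / sp.factorial(k) for k in range(4)]     # 1, x, x^2/2!, x^3/3!

columns = []
for b in basis:
    db = sp.diff(b, x)
    # the coefficient of x**k / k! in a polynomial is its k-th derivative at 0
    coords = [db.subs(x, 0)]
    coords += [sp.diff(db, x, k).subs(x, 0) for k in range(1, 4)]
    columns.append(coords)

D = sp.Matrix(columns).T      # column j = coordinates of d/dx(basis[j])
N = sp.Matrix(4, 4, lambda i, j: 1 if j == i + 1 else 0)
sp.pprint(D)
assert D == N                 # differentiation really is our friend N
```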
Yes! And arguably this explains the appearance of the factorials $1/k!$ in Taylor's theorem.