I hope you’ve all been well! I’ve just gotten back from vacation where I had plenty of time to clear my head, which often left me thinking about category theory! As a result, I’ve got a lot of questions. Many of them have to do with fiber bundles, which I want to get more into, so this thread will be all about that.
To start off, I want to know more about two ways I have of thinking about the definition of a fiber bundle, and we'll start in Set because Top might be a little more complicated. The usual definition starts with a "bundle", which is a surjective function E -> B that you can think of as a projection map. Because the map might not be injective, it can send a whole bunch of elements of E to a single element b in B, with this bunch of elements sitting "above b". That means there's a subset F of E, with inclusion F -> E, such that composing with E -> B yields a constant function F -> B. If all these subsets F for a given surjective function are isomorphic to one another, then the bundle is a fiber bundle and F is known as the "fiber" of the fiber bundle.
The other way to define a fiber bundle I call the "intuitive definition" because it's the way I first learned about them. In this definition, we start with only the base set B and define a fiber bundle to be an assignment operation that assigns a whole other set- the fiber- to every element of B. Rather than defining a function into B, this defines one out of B, into some "set of sets" whose elements might be our fibers. If we want to relate this back to E, then this set of sets might be P(E), the powerset of E. The function in mind then sends a point b in B to the subset e of E such that all points in e are mapped to b by the corresponding projection map from the usual definition of fiber bundle. In other words it does exactly what I want- it sends b to the fiber above it, assigning that fiber to b. While all fiber bundles and even surjective functions in general have such a construction, the opposite does not seem to be true. This is because an arbitrary function from B into P(E) may assign the same elements of E to two different elements of B, either by failing to be injective or by having a subset in its image that is contained in another subset in the image. In addition, we have to make sure we get back E by taking the union of all the subsets in the image, otherwise we might accidentally leave out some points of E that we want to map into B.
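To make this precise - a formalization of the two failure modes just described, plus the nonemptiness needed for surjectivity, not spelled out in the original message - a function $\chi : B \to P(E)$ arises as $\chi(b) = p^{-1}(b)$ for some surjection $p : E \to B$ exactly when its values are nonempty, pairwise disjoint, and cover $E$:
$$\chi(b) \neq \emptyset, \qquad \chi(b) \cap \chi(b') = \emptyset \ \text{ for } b \neq b', \qquad \bigcup_{b \in B} \chi(b) = E,$$
in which case $p$ is recovered by sending each $e \in E$ to the unique $b$ with $e \in \chi(b)$.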
Overall, after some reflection, it seems these two definitions of a fiber bundle are almost "inverses" of one another. My questions are: What kind of construction have I stumbled upon here? Secondly, would you consider both of these to be valid definitions of a fiber bundle, or would you rather view the second definition as a construction made possible by (and thus entirely a consequence of) the first definition? I'm excited to start this discussion of fiber bundles with you all!
The relation between these two constructions of 'bundle' is a somewhat decategorified version of the relation between Grothendieck fibrations and the Grothendieck construction.
Namely, given a set $B$, and regarding it as a discrete category, there's an equivalence between:
- the slice category $\mathsf{Set}/B$, whose objects are functions $E \to B$, and
- the functor category $\mathsf{Set}^B$, whose objects are assignments of a set to each element of $B$.
This is a baby version of how, for any category $C$, there's an equivalence (at the 2-categorical level) between:
- Grothendieck fibrations over $C$, and
- pseudofunctors $C^{\mathrm{op}} \to \mathbf{Cat}$.
There are many other related correspondences. One of the more basic ones is the idea of the [[characteristic function]] of a subset: given a set $B$, there's an equivalence between:
- subsets of $B$, and
- functions $B \to \{0, 1\}$.
(If you know about [[subobject classifiers]] you can say this last one in a more general way.)
Finally, I'll add that studying fiber bundles in $\mathsf{Set}$ rather than some category with 'cohesion' like $\mathsf{Top}$ hides one of the most important features of fiber bundles: the 'local triviality' condition. Nonetheless it's good to warm up with $\mathsf{Set}$.
This is the general theme of universes, a.k.a. classifying objects. Certain kinds of stuff that depend contravariantly on an object $X$ might be equivalent to maps $X \to U$, where $U$ is a universe/classifier.
Let me share a short exercise I did recently while learning about étale bundles and sheaves.
Let $f : X \to Y$ be a function in $\mathsf{Set}$. We have a functor $f_* : \mathsf{Set}^Y \to \mathsf{Set}^X$ given by precomposition with $f$.
It turns out we also have a functor $f^* : \mathsf{Set}^X \to \mathsf{Set}^Y$. It is best understood when we translate to bundles. Indeed, identifying $\mathsf{Set}^X \simeq \mathsf{Set}/X$, $f^*$ is obviously defined: for every bundle $p : E \to X$, we get a bundle $f \circ p : E \to Y$.
[edit: I initially wrote $p \circ f$, I fixed it; thanks @John Baez ]
More precisely, given $F \in \mathsf{Set}^X$, I have a bundle $p : E \to X$ where $E = \coprod_{x \in X} F(x)$. Then $f \circ p$ sends the fiber $F(x)$ into the fiber over $f(x)$. Translating back to $\mathsf{Set}^Y$, we obtain
$$(f^* F)(y) = \coprod_{x \in f^{-1}(y)} F(x).$$
$f^*$ should be the left (or right?) adjoint to $f_*$.
[edit: I'm 50% sure about the notations $f^*$ and $f_*$. There seem to be competing conventions.]
If $p : E \to X$ and $f : X \to Y$, what's $p \circ f$?
John Baez said:
If $p : E \to X$ and $f : X \to Y$, what's $p \circ f$?
oops, it's fixed now. I keep making mistakes when translating from "arrow order" to "$\circ$ order".
Okay, thanks for fixing that. What's confusing me still is that I'd always call that bundle the pushforward of the bundle $p$ along the map $f$, while you're calling it $f^* p$, which is a standard notation for the pullback bundle, formed by pulling back along $f$.
For example, see
This could be a case where notational conventions conflict, or it could be that you've got your $f_*$ and $f^*$ switched. I've often been blindsided by conflicting conventions in this area.
But anyway, your main point is great: the two viewpoints on 'bundles', giving the equivalence (not isomorphism)
$$\mathsf{Set}^X \simeq \mathsf{Set}/X,$$
provide usefully different outlooks for how to push forward or pull back bundles.
I think I've switched the notations, let me fix it too.
I'm actually afraid the conventions in topos theory may be opposite to the conventions in bundle theory! I don't want to think about it. :grimacing:
Sometimes, the world would be a better place without symmetry.
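For readers trying to track the notation dispute, one common convention (stated here as a summary, not as part of the thread) goes as follows: for $f : X \to Y$, precomposition $f^* : \mathsf{Set}^Y \to \mathsf{Set}^X$, $(f^* G)(x) = G(f(x))$, has adjoints on both sides:
$$\Sigma_f \dashv f^* \dashv \Pi_f, \qquad (\Sigma_f F)(y) = \coprod_{x \in f^{-1}(y)} F(x), \qquad (\Pi_f F)(y) = \prod_{x \in f^{-1}(y)} F(x).$$
Under $\mathsf{Set}^X \simeq \mathsf{Set}/X$, the left adjoint $\Sigma_f$ (also written $f_!$ or $f_*$ by various authors) is postcomposition with $f$, i.e. the "pushforward" bundle discussed above - note that this $f^*$ is the opposite of the usage earlier in the thread, which is exactly the clash being lamented.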
Thanks for the help! I liked all the insights about what is going on here, especially about how this construction resembles Grothendieck construction and about how this induces the equivalence between Set^X (functors into Set from X) and Set/X (the functions into X). But I'm not sure how generalizable this insight is, since I eventually do want to talk about Top and there isn't a notion of a Grothendieck-like construction for Top.
I thought a bit more about the "classifying object" perspective with the subobject classifier. There, you can find a close relationship between the concepts of pullback, subset, and functions of the form B -> {T, F}. In this case, it is all summarized in a single pullback diagram: the usual notion of a subset S -> B is given by taking the pullback of a function B -> {T, F} along T: * -> {T, F}, as drawn below.
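In diagram form - the square described in the previous paragraph, drawn out for reference - a subset $S$ of $B$ with characteristic function $\chi_S : B \to \{T, F\}$ arises as the pullback
$$\begin{array}{ccc} S & \longrightarrow & * \\ \downarrow & & \downarrow T \\ B & \xrightarrow{\;\chi_S\;} & \{T, F\} \end{array}$$
so that $S = \chi_S^{-1}(T)$, and conversely every subset of $B$ arises this way from a unique $\chi_S$.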
This got me thinking about what kind of classifier a powerset might be- that is, what generic functions A -> P(B) might represent (when it comes to assigning what elements in B are "over" some element of A by some map). First, we have to relax the conditions I mentioned above: if we relax the union condition, we are no longer in "function land" since we now are defining a map B -> A where not all points in B are mapped. If we relax the other condition, then this corresponds to a map B -> A where a single element of B can be mapped to multiple elements of A. In a sense, it seems that P(B) might be the classifier for binary relations from B to A. I think I'm on the right track because I remember seeing that profunctors, which are like categorifications of relations, can be thought of as functors into a presheaf category, which is like the categorification of a powerset.
This also leads me to wonder how pullbacks might factor in. First, a pullback can establish a subset of a product, which seems to define a binary relation in itself (and would make the powerset of A x B the set of all binary relations from A to B). Second, taking the pullback of a set at a point and a function into it invokes the "inverse image" construction of that function, which would be like a way of getting a fiber over an element from that element. Is there some pullback diagram that summarizes all of this in an easy way analogous to the one above? If not, then how do these concepts relate to one another?
John Onstead said:
Thanks for the help! I liked all the insights about what is going on here, especially about how this construction resembles Grothendieck construction and about how this induces the equivalence between Set^X (functors into Set from X) and Set/X (the functions into X). But I'm not sure how generalizable this insight is, since I eventually do want to talk about Top and there isn't a notion of a Grothendieck-like construction for Top.
There is; in fact there are a number of closely related such constructions. The simplest gives covering spaces of a space X, another gives fibrations over X (in the topological sense) with a chosen fiber, and another gives fiber bundles over X with a chosen structure group. In every case these are classified in terms of maps from X to some 'universal' entity. The technical details differ but the basic idea is the same... and the same as in the simpler examples I listed earlier here.
Some of the topological examples are sketched very sketchily in week233, near the end, where they are presented as a series of slogans in all caps.
Aha! I think I may have just answered a question I asked above... I know a relation is a function A x B -> 2, and so I was trying to see if this might have any connection with functions A -> P(B). But these are essentially two different worlds- a world of products and a world of powersets. After staring at this for a while it suddenly dawned on me that I actually have seen something like this before and that there is a way to bridge these two worlds, and it's through the process known as "currying". In this process you have a bijection of sets from the set of functions of form A x B -> C and the set of functions of form A -> C^B. If you just substitute C = 2 then you get exactly the connection I was looking for! So now I have a systematic way of getting from one definition to the other. If I start with a function B -> A, I can realize this function as a special kind of binary relation between A and B, and thus a function of form A x B -> 2. I can then curry this function to get a function A -> P(B) that does exactly what I need it to do.
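Spelled out - this is just the bijection described above, recorded for reference with $P(B) = 2^B$:
$$\mathsf{Set}(A \times B, 2) \;\cong\; \mathsf{Set}(A, 2^B) \;=\; \mathsf{Set}(A, P(B)),$$
so starting from $f : B \to A$, the relation "$f(b) = a$" is a function $A \times B \to 2$, and currying it sends $a \mapsto f^{-1}(a) \in P(B)$: exactly the fiber assignment.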
The good news too is that this process of currying is very generalizable, so long as you have a cartesian closed structure on your category. Top famously does not, but certain subcategories of "nice" spaces do. But I'm not exactly sure how one might be able to do an analogous operation to what I did above in Set within even these nice categories, especially since they might lack a subobject classifier. Would you have any advice on how to move forward with this?
John Baez said:
In every case these are classified in terms of maps from X to some 'universal' entity. The technical details differ but the basic idea is the same... and the same as in the simpler examples I listed earlier here.
Some of the topological examples are sketched very sketchily in week233, near the end, where they are presented as a series of slogans in all caps.
I guess these "classifying objects" are the theme of the day! I'll look more into this, it seems very interesting, and maybe it can help me solve the above problem with trying to transfer my reasoning about bundles from Set into Top.
John Onstead said:
So now I have a systematic way of getting from one definition to the other. If I start with a function B -> A, I can realize this function as a special kind of binary relation between A and B, and thus a function of form A x B -> 2. I can then curry this function to get a function A -> P(B) that does exactly what I need it to do.
But I'm not exactly sure how one might be able to do an analogous operation to what I did above in Set within even these nice categories, especially since they might lack a subobject classifier.
I haven't read the whole conversation carefully, so hopefully this comment is still helpful and on-topic.
If you have a morphism $f : E \to B$, and you want to associate to each "point" of $B$ some subobject of $E$, then I suspect you can do this using a pullback:
$$\begin{array}{ccc} F & \longrightarrow & * \\ \downarrow & & \downarrow b \\ E & \xrightarrow{\;f\;} & B \end{array}$$
The top right object in this diagram I've called $*$, with the idea that a morphism from it to $B$ should correspond to some notion of "point of $B$". In $\mathsf{Set}$, $*$ should be a singleton set. I don't know in which categories we have a notion of such an object, one where a morphism out of it corresponds to a "point" of the target object. (In a monoidal category, I'd be tempted to set $*$ to the monoidal unit $I$.)
At any rate, as long as $b : * \to B$ is a monomorphism, then the pulled-back morphism $F \to E$ will also be a monomorphism - and hence it will specify a subobject of $E$.
In this way, if $b : * \to B$ is a point of $B$ and also a monomorphism, then one can associate to this point a subobject of $E$. Notice that this map from (certain) points of $B$ to subobjects of $E$ depends on $f$. Further, the subobject "hovers over" $b$ in the sense that the composite $F \to E \xrightarrow{f} B$ factors through $b$. I'm not sure what additional conditions one should place on $*$ so that this really reflects the intuition of our induced subobject of $E$ "hovering over" $b$.
I don't know if this really addresses what you are interested in doing! (But I find this idea interesting to think about, so I couldn't resist mentioning it!)
David Egolf said:
I haven't read the whole conversation carefully, so hopefully this comment is still helpful and on-topic.
Yes! Thanks for the insight. Indeed, a fiber in a fiber bundle can actually be specified via a pullback in exactly the way you describe. This kind of pullback is known as an "inverse image" operation. My goal is to try and find a function which encapsulates the consequence of performing such an operation, that is, to find a function out of the base space that sends a point to an element in another space (a "space of subspaces") that is supposed to represent the fiber over that point. My trouble is that the notion of a "space of subspaces" doesn't appear to be available, since no category of topological spaces, even the nice ones, is a topos or has a subobject classifier the way Set does. So, I'm finding it difficult to come up with an object that will serve as the codomain of the function I want to construct.
I've played around with options for a little bit. Here's the furthest I got. In a "true" fiber bundle, all fibers f are isomorphic to one another. So in a category of nice spaces I can find the homspace Hom(f, E) where elements are continuous functions from the fiber space into the total space. This includes all embeddings of fibers into the total space. Our function B -> Hom(f, E) would then be targeted at the elements of Hom(f, E) that are embeddings and thus that can be said to represent the fibers. I'm not sure if I'm going in the right direction so I will stop there for today. If anyone has any information on whether it's possible to define a "space of spaces" or any advice on what I could try next please let me know and I can look into it tomorrow!
@John Onstead wrote:
My trouble is that the notion of a "space of subspaces" doesn't appear to be available, since no category of topological spaces, even the nice ones, is a topos or has a subobject classifier the way Set does.
Wanting a classifier for all subobjects is indeed too much: for example in Top we have a monomorphism sending $\mathbb{R}$ with its discrete topology to $\mathbb{R}$ with its usual topology, and this makes the discrete real line into a subobject of the usual real line - a subobject that evades classification.
But in Top, the space $\{0,1\}$ with its codiscrete topology acts as a classifier for so-called 'strong' subobjects. A 'strong' subobject of a topological space $X$ is what we usually call a 'subspace' of $X$. It's an isomorphism class of monos $i : S \to X$ where $S$ is homeomorphic to its image $i(S)$ with its subspace topology. That's not true in the counterexample I just gave. And here's the good news: any fiber of a fiber bundle is a strong subobject of the total space!
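Concretely, sketching the standard fact behind this classifier: since every set-map into a codiscrete space is continuous, continuous maps $X \to \{0,1\}_{\mathrm{codiscrete}}$ are just arbitrary subsets of the points of $X$, and pulling back the point $1$ equips such a subset with the subspace topology. This gives the classifying bijection
$$\mathsf{Top}(X, \{0,1\}_{\mathrm{codiscrete}}) \;\cong\; \{\text{strong subobjects of } X\}.$$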
A quasitopos is a generalization of a topos that has a classifier for strong subobjects. Now Top is not a quasitopos, but the problem is not the lack of a classifier for strong subobjects: it's that Top is not locally cartesian closed. There are various categories that solve this problem, which provide quasitopoi in which you can do topology. Three examples are listed here.
I hasten to add that I'm not a professional quasitopos theorist, and I've never tried doing topology in a quasitopos. I've only done differential topology in a quasitopos, namely the quasitopos of diffeological spaces - and I wrote a paper with Alex Hoffnung explaining the rudiments of quasitopos theory and how they play out in this example (and others):
In particular we give a lowbrow proof that any category of 'concrete sheaves' is a quasitopos, and we describe what the strong subobjects are like, and the object that classifies these.
You can get "categories of spaces" that are topoi, it's just that you can't get them by starting with the classical notion of "topological space" and cutting down to a subcategory. Instead you have to enlarge (some subcategory of) topological spaces to a bigger category in which the missing classifying objects exist. For example, there is Johnstone's [[topological topos]] and the topos of [[condensed sets]].
Is * the name of that topos, or is there a footnote down below somewhere?
John Baez said:
And here's the good news: any fiber of a fiber bundle is a strong subobject of the total space!
That sounds like great news! I guess my construction can work after all in an appropriate quasitopos of nice spaces. In this case, I can define a "space of strong subspaces" Hom(E, {0,1}) and define functions into this of form B -> Hom(E, {0,1}). I think this solves my problem, thanks!
Mike Shulman said:
You can get "categories of spaces" that are topoi, it's just that you can't get them by starting with the classical notion of "topological space" and cutting down to a subcategory. Instead you have to enlarge (some subcategory of) topological spaces to a bigger category in which the missing classifying objects exist.
That's interesting! The topological topos looks especially cool, I suppose this could be another way of going about what I want to do!
I think this settles my questions on definitions of bundles and how they work on a high level. But I still have lots more questions about them, in particular how they connect to physics. I'll be back later to continue the discussion in this direction!
John Baez said:
Is * the name of that topos, or is there a footnote down below somewhere?
I think it's a topos*, in the same way that a lot of baseball players from the 90s have a * next to their names because of the steroids. Condensed sets aren't really quite a topos because they're sheaves on a really huge site, although there are lots of ways more-or-less-around this.
Interesting, thanks. A topos on steroids.
So fiber bundles are very important for physics because the very concept of a "field theory" can be expressed in terms of fiber bundles. This makes somewhat intuitive sense. A field theory is something that analyzes how you can assign some parameter to every point in physical space (or more generally physical spacetime). A parameter can be a scalar (like temperature or potential), vector (like velocity for fluid dynamics or a force field), vector-like object (like a magnetic field that uses the right hand rule), or even more complex objects like a spinor (for fermions) or tensor (in Einstein's theory). Regardless, you can reinterpret this analysis as finding the set of all parameters allowed under a field theory (the real numbers in the case of scalars, a vector space in the case of vectors, etc.) and then attaching this set to each point in your space(time). This turns out to be precisely the notion of a bundle under my "intuitive" definition above, using physical space(time) as the base space. A field in a field theory is then a "section" of this bundle- a choice of a single value from the set of possible values for each point in the base space. So now I'd like to talk a little more about sections!
A section is simply a right inverse. So if you have the projection map for a fiber bundle E -> B, a section is a choice of a single element in a fiber over a point in B, for all points in B (or for all points in an open subset of B if you want to use local sections). So if you send an element in B under a section B -> E, and then project, you should go on a round trip back to that same element. This all seems straightforward.
My first question to get warmed up is about trivial bundles- that is, the canonical projection A x B -> A from the product space. A section of this bundle is equivalently a function from A to B, in the sense of being a "graph" of the function that sends x in A to (x, f(x)) in A x B. But there's a problem: there's already a notion of a "graph" of a function using pullbacks! This defines an object Graph(f), whose elements are pairs (x, f(x)), that embeds into A x B. This raises the question: how are these notions related, and how can one convert from one into the other for any function from A to B (using only category theory concepts, of course)?
I gave this question an attempt and I found something maybe in one direction: for a function f: A -> B and corresponding section s: A -> A x B (still no idea for how to get from f to s), the image of s, Im(s), might be the same as Graph(f). Graph(f) is then the object you factor the section s through in an (epi, mono) factorization, with the embedding Graph(f) -> A x B now reinterpreted as the "mono" of this factorization rather than a result of a pullback. I still feel I'm missing the bigger picture here so any help is greatly appreciated!
I gave this question an attempt and I found something maybe in one direction: for a function f: A -> B and corresponding section s: A -> A x B (still no idea for how to get from f to s),
The formula is $s = \langle \mathrm{id}_A, f \rangle$.
(Remember, whenever you have morphisms $g : C \to A$ and $h : C \to B$, you get a morphism $\langle g, h \rangle : C \to A \times B$ by the universal property of the product.)
the image of s, Im(s), might be the same as Graph(f)
Yes, that should be true.
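For the record, the verification is short: with $s = \langle \mathrm{id}_A, f \rangle : A \to A \times B$, the universal property gives
$$\pi_A \circ s = \mathrm{id}_A, \qquad s(x) = (x, f(x)),$$
so $s$ is a section of the projection $\pi_A : A \times B \to A$, and its image $\{(x, f(x)) : x \in A\}$ is exactly $\mathrm{Graph}(f)$.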
Thanks, this helps a lot. I had been trying to work with the projection part of the universal property of the product but I seem to have overlooked this part, thanks for the reminder!
I'll be back in a bit for my main question about defining sections in category theory.
Fiber bundles are important in field theory because they allow you to define functions that take in a point on a space and output something that isn't just a point in another topological space (like a vector, tensor, etc.) This might not seem too important because in material set theory, this is no problem. Just define a topological space, make the elements into vectors, and then have a function map into it in the usual sense. But in category theory you can't just declare elements of something to be of some other nature since technically there aren't elements at all, just maps (i.e., generalized elements). So without bundles, you are stuck with only scalar functions (which are continuous functions that map into R) since there's no topological space of vectors, tensors, etc., and defining a function from a topological space to a vector space is meaningless because it results in a massive typing error. (I love category theory and find it a really useful common language for math, but if I had one problem with it, it's this extreme strictness in how maps between objects can be defined without causing typing errors. Material set theory for sure wins here for its flexibility, allowing you to define functions between any two, even completely different, kinds of math objects since they all are really just sets!)
But here's my problem: HOW exactly do sections resolve this typing error problem? If we take the naive definition of a section of a bundle E -> B, it's just a morphism B -> E that acts as a right inverse. This is just another continuous function, and again, it can be at most a continuous function and not a vector-valued one, since that would create a typing error. E does not inherently have any structure to indicate its elements are vectors, so it makes no sense to assume its elements are vectors. Usually we then work within the appropriate category of extra structure. But we run into a problem here: we aren't adding structure onto E, we are taking the whole thing E -> B, viewed as an object in C/B, and adding extra structure onto that to get to an object of VectBund(B). By smushing everything into a single object, the notion that the bundle contains a function E -> B you can even meaningfully take a section of seems to disappear. So my question is: how do category theorists define a section of a vector bundle without creating a typing error?
John Onstead said:
...defining a function from a topological space to a vector space is meaningless because it results in a massive typing error.
Let $B$ be a topological space, and let $V$ be a vector space.
Then to associate a vector to each point of $B$, couldn't we use a function $U(B) \to U(V)$? Here $U(B)$ is the underlying set (of points) of the space $B$ and $U(V)$ is the underlying set (of vectors) of $V$.
Of course there's no guarantee that this function respects any of the structure of the topological space $B$ or the vector space $V$. So maybe this is not what you are looking for!
I don’t follow this argument for the importance of fiber bundles at all, actually. I think there are topological spaces of vectors, tensors, etc. What makes you say there aren’t?
Well, for one thing, for the ordinary notion of a "continuous vector field", we actually do treat $\mathbb{R}^n$ as a topological space, not just a vector space, using the iterated product topology of the usual topology on the real line. Then we make that a vector space object in $\mathsf{Top}$ using a continuous scalar multiplication and vector addition map. The only thing that's not continuous here is the division on $\mathbb{R}$ but there are ways around that, by defining a field in a way that doesn't use division, or just not caring as much about the difference between vector spaces and free modules over a ring.
In case it is helpful to note, I think you can view a topological vector space as a vector space internal to $\mathsf{Top}$. Then in particular $V$ is a topological space and now we can talk about continuous maps $B \to V$, where $B$ is some topological space.
This is a more satisfying situation than what I sketched above, because now our function between points of $B$ and vectors of $V$ will respect the topological structure present.
[EDIT: I see @James Deikun already mentioned much of this above!]
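Concretely, the internalization being described - a sketch, with the axioms left implicit: a topological vector space over $\mathbb{R}$ can be presented as an object $V$ of $\mathsf{Top}$ equipped with continuous maps
$$+ : V \times V \to V, \qquad \cdot : \mathbb{R} \times V \to V$$
satisfying the usual vector space axioms. A "vector-valued function" is then just a continuous map $B \to V$ in $\mathsf{Top}$, with no typing error.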
Thanks for the help! I did consider using functions via the forgetful functor to Set, but that loses too much information about the structure of the objects I'm working with. I also considered topological vector spaces, but this doesn't work because a map of topological vector spaces isn't actually a continuous map, it's a continuous linear map. Only very few continuous maps are linear, so in general this cannot be used to define sections of vector bundles. There's creative ways of getting around this, like switching to a (co)differential category and using the (co)monad defined on that category to get nonlinear continuous maps between topological vector spaces, but I feel this is severely overcomplicating the situation.
I think I found something else that might help on this problem under the nlab article here. The problem is, it's difficult to understand what they are talking about so I don't know if this actually would work. They seem to imply a section can be defined in the category VectorBundle(B) after all (unlike my previous conception) as a homomorphism of vector bundles over B given by B x k -> E. They state that B x k is the trivial line bundle over B, thus making it into a vector bundle over B (somehow), but I don't know what k is nor do I know where this product is taking place. Second, E is not a vector bundle over B, but maybe they are using this to represent the actual vector bundle p: E -> B with all the added structure. Would someone know what these symbols mean and how this works?
John Onstead said:
They seem to imply a section can be defined in the category VectorBundle(B) after all (unlike my previous conception) as a homomorphism of vector bundles over B given by B x k -> E. They state that B x k is the trivial line bundle over B, thus making it into a vector bundle over B (somehow), but I don't know what k is nor do I know where this product is taking place.
k typically means 'the ground field': whenever you're talking about vector spaces or vector bundles you must have chosen a field $k$ those vector spaces are modules over, called 'the ground field', typically $\mathbb{R}$ or $\mathbb{C}$ if you're doing topology.
Every space X has a trivial vector bundle over it with fiber k, called the trivial line bundle. This vector bundle has an obvious section, and vector bundle maps from this vector bundle to any other vector bundle E over X correspond bijectively to sections of E. It's a good exercise to figure out how this works.
Summarizing, we can say the trivial line bundle over X is the 'walking vector bundle over X equipped with a section'. Here 'walking' means it has the universal property described in my previous paragraph.
Second, E is not a vector bundle over B, but maybe they are using this to represent the actual vector bundle p: E -> B with all the added structure.
Yes, everyone does that. (I did it above.)
I continue to be confused about what's bugging you, John Onstead. A section of a fiber bundle $p : E \to B$ is a section of $p$ in the category of topological spaces, that's it. The fibers may well come equipped with e.g. vector space structure too, but that's a separate issue.
Maybe the point is that a section of a vector bundle is nothing more than a section of the underlying fiber bundle? I'm sort of making up possibly-soothing words here because I don't really see the concern. You seem to be pretty committed to the idea that the value of a section at some point in the base space "is a vector" in some ontological sense that doesn't really seem meaningful to me.
John Baez said:
Summarizing, we can say the trivial line bundle over X is the 'walking vector bundle over X equipped with a section'. Here 'walking' means it has the universal property described in my previous paragraph.
Ah now this I can understand, it's just another basic application of representability (like with the classifying spaces we were talking about above). I think the best part about this is that these morphisms exist in VectorBundle(B) where all the structure is already defined, so the bijection is a good way to check to make sure that our vector valued functions (the morphisms in VectorBundle(B) out of the trivial line bundle) actually are corresponding with the sections of the underlying topological bundles in Top, B -> E. In a way, the section in Top B -> E is the "underlying morphism" of the morphism in VectorBundle(B) where we have "forgotten" all the vector-y things.
John Baez said:
Every space X has a trivial vector bundle over it with fiber k, called the trivial line bundle. This vector bundle has an obvious section, and vector bundle maps from this vector bundle to any other vector bundle E over X correspond bijectively to sections of E. It's a good exercise to figure out how this works.
I think I will try to give this exercise a shot so I can get a better understanding of what is going on here. My current guess is that it has to do with "E" in VectorBundle(B) being a "counterpart" in a way to E in Top, and B x k being a "counterpart" (in a canonical way) in VectorBundle(B) to B in Top. Therefore, just as the underlying section is a map B -> E, the section with structure in VectorBundle is B x k -> "E". But I'll do some work on this and report back later if this is true or not.
There's just one thing I'm still missing. How do you take the product B x k and which category does this product take place in? As far as I know, B is a topological space and k is a field. You can't take the product of two objects in two different categories (here, Top and Field) for basically the same reason I was mentioning earlier for why you can't define a morphism between two objects in different categories. You can define the Cartesian product U(B) x V(k) (where U and V are the forgetful functors to Set for Top and Field respectively) but then you've lost all the structure and are now left with sets with no structure on them.
Another way of looking at this: The trivial line bundle $B \times k$ on $B$ is the free vector bundle on the (completely) trivial bundle $\mathrm{id}_B$ (usually referred to metonymically as $B$). By the free-forgetful adjunction, bundle maps from $B \times k$ to a vector bundle $p$ (usually referred to metonymically as $E$) correspond isomorphically to "plain" bundle maps from $\mathrm{id}_B$ to the underlying plain bundle of $p$, which in turn correspond isomorphically to sections of the underlying plain bundle of $p$.
Kevin Carlson said:
You seem to be pretty committed to the idea that the value of a section at some point in the base space "is a vector" in some ontological sense that doesn't really seem meaningful to me.
I guess that would be a good way to describe it. Maybe I just have a weird philosophy of math! But in my philosophy, it only makes sense to call an element of a set a "vector" if and only if there is an explicitly defined vector space structure on that set. Without this structure, or if we are dealing with the extra structure separately (and thus the structure is not explicit in our work), I feel the only way to make the elements of the set vectors is to impose an external interpretation on the set that the elements are meant to represent vectors. But this interpretation comes from outside of mathematics and so I don't see it as mathematically meaningful or rigorous. But again this is just how I think of math, maybe I've learned it in the wrong way?
James Deikun said:
Another way of looking at this: The trivial line bundle B×k on B is the free vector bundle on the (completely) trivial bundle idB (usually referred to metonymically as B). By the free-forgetful adjunction, bundle maps from B×k to a vector bundle p (usually referred to metonymically as E) correspond isomorphically to "plain" bundle maps from idB to the underlying plain bundle of p, which in turn correspond isomorphically to sections of the underlying plain bundle of p.
Wow this is really helpful thanks for the insight! I had a hunch there was something "free functor-y" about the construction B x k but this helps make that explicit. I'll have to dive into this info a bit more but at first glance maybe I wasn't so far off in my initial impression that you can think of the sections of the plain bundle as "underlying" in some sense the bundle maps from B x k -> p, thanks to the involvement of the forgetful functor as part of this free-forgetful adjunction.
The Cartesian product $B \times k$ happens between the topological space $B$ and the underlying topological space of the topological field $k$, in $\mathsf{Top}$. However, like when referring to the bundle $p : E \to B$ plus all its vector bundle structure as $E$, there is a lot of other structure hidden in the notation.
James explained it quite nicely. Let me just add that the nLab writes $k$ for a topological field mainly so they don't have to separately discuss the two cases that people care about: $\mathbb{R}$ and $\mathbb{C}$. We could work with more general topological fields, but > 99% of work on vector bundles in topology and differential topology is about $\mathbb{R}$ and $\mathbb{C}$, so those are what you should be thinking about if those are the subjects you are trying to learn.
In algebraic geometry we talk about vector bundles for many other fields, but the whole framework is different. Perhaps you could unify it with the other subjects (using tricks like thinking about the underlying topological space of a scheme, and perhaps the discrete topological field on a field) but I'm reluctant to broaden this conversation to cover algebraic geometry!
John Onstead said:
I guess that would be a good way to describe it. Maybe I just have a weird philosophy of math! But in my philosophy, it only makes sense to call an element of a set a "vector" if and only if there is an explicitly defined vector space structure on that set. Without this structure, or if we are dealing with the extra structure separately (and thus the structure is not explicit in our work), I feel the only way to make the elements of the set vectors is to impose an external interpretation on the set that the elements are meant to represent vectors. But this interpretation comes from outside of mathematics and so I don't see it as mathematically meaningful or rigorous. But again this is just how I think of math, maybe I've learned it in the wrong way?
The irony of this from my side is that you're expressing the perceived lack of rigor in a way which is itself not rigorous! I think that if you tried to pin down what you feel is missing in precise mathematical terms you'd likely find your complaint dissolves. It's already been said in this thread but in case it's worth stating in another way, a precise categorically-minded definition of "a vector bundle over $B$ with fiber $V$", for some topological vector space $V$, is a map of topological spaces $p : E \to B$ together with homeomorphisms $p^{-1}(U_i) \cong U_i \times U(V)$ for open sets $U_i$ covering $B$ - with $p^{-1}(U_i)$ the pullback (in the category of topological spaces!), $U$ the forgetful functor from topological vector spaces to topological spaces - and such that the transition functions land in the image of the general linear group $GL(V)$ under the forgetful functor. So, in particular, the interpretation of elements of $E$ as vectors is a part of the specification of a vector bundle. In most mathematical writing these details about $U$ and $GL(V)$ will be notationally elided, but they're there. It seems to me like that ought to be more than enough rigor to resolve the concerns you're expressing, if I've understood you?
To add to Kevin's clarification: finite-dimensional vector spaces over R and C have a unique topological vector space structure, and 99% of the time 'vector bundle' is used to mean finite-dimensional vector bundle: people don't say 'finite-dimensional vector bundle', it's taken for granted.
If someone is using an infinite-dimensional vector bundle it's their responsibility to say so and to specify the topology on the fibers. See for example [[Banach bundle]].
Ok, I've chewed over this a little bit more and I think I've come to some understanding of what is going on so I can finish this exercise. First, as James Deikun mentioned, there's an adjunction between VBund(B) and Top/B. An adjunction gives rise to a bijection of Hom-sets of the form HomC(c, U(d)) ~ HomD(F(c), d). In this case, C = Top/B and D = VBund(B), while c = idB and d = E. Plugging all this in we get the bijection Hom_Top/B (idB, U(E)) ~ Hom_VBund(B) (F(idB), E). We know F(idB) ~ B x k so we get at last the bijection Hom_Top/B (idB, U(E)) ~ Hom_VBund(B) (B x k, E). This precisely tells us that morphisms from B x k to some vector bundle E are in bijection with morphisms from idB to the underlying bundle of E in Top/B. From this we see the adjunction precisely gives rise to the representability property (the "walking" nature) of B x k that John Baez mentioned above too.
We can then continue on by understanding what a morphism idB -> U(E) is doing in Top/B. A morphism in a slice category over B is just a morphism between the domains of the two morphisms into B that makes the triangle commute. In this case, we have the objects B and E as the domains, so idB -> U(E) will be a morphism B -> E such that B -> E followed by the projection U(E): E -> B is the identity. This is precisely the definition of a section. So we find indeed that we can extend the bijection of hom sets above to have Hom_Top/B (idB, U(E)) ~ Hom_VBund(B) (B x k, E) ~ SectionsOf U(E).
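Summarizing the chain of bijections just derived in one line:
$$\mathrm{Hom}_{\mathsf{VBund}(B)}(B \times k,\, E) \;\cong\; \mathrm{Hom}_{\mathsf{Top}/B}(\mathrm{id}_B,\, U(E)) \;\cong\; \{\text{sections of } U(E)\}.$$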
I think I understand everything from this abstract picture, then. But if the original exercise I was given wasn't about this abstract stuff that I already understand well, but to try and prove something like the fact that this adjunction exists in the first place, or to prove the correspondence between B x k -> E and sections B -> E without category theory, then I think I'm out of luck! I promise I'm learning about this stuff in the background, it's just that I'm just a very slow learner- after all I barely passed my basic high school calculus class even with lots of tutoring, and it did take me a full 6 months to finally understand what a universal property was doing!
James Deikun said:
The Cartesian product B×k happens between the topological space B and the underlying topological space of the topological field k, in Top
This makes sense. I also see how the first projection onto B gives the structure of a trivial bundle, since any product and its canonical projection A x B -> A is a trivial bundle. I'm a little confused on how the fibers of the projection have the structure of a one dimensional space. What if we are working with the topological field C of complex numbers? Doesn't that space have two dimensions, the real and imaginary axis?
Also I'm a little confused about where these "topological rings" are located. Are they in the category Ring(Top) or the category Ring(Top/B)? If it's the latter then it makes sense because you need a ring object internal to Top/B so that you can define a module object internal to Top/B, such as a vector bundle. But generally the term "topological ring" is reserved for objects of Ring(Top), not Ring(Top/B). Hence my confusion.
Kevin Carlson said:
It seems to me like that ought to be more than enough rigor to resolve the concerns you're expressing, if I've understood you?
I think I understand where you are coming from, but let me just make sure. It seems you are using the forgetful functor U as a way of "putting aside" the extra structure. So by denoting a topological space as U(V) rather than as B, E, etc., you are explicitly indicating via the notation that we are meant to take U(V) as the underlying topological space of some topological vector space, and that this structure is on hand should we need it later. In other words, let's say U(V) was the topological space R^3. It seems then that the object/notation U(V), in a sense, contains "more information" than if we just wrote down R^3, even though they are meant to refer to homeomorphic topological spaces. Please let me know if I'm on the right track!
Last bit of update, I'm currently looking into spaces of vector bundle sections. I think that will be the next topic once we've wrapped up all the loose ends here, since I have quite a few questions about them!
@John Onstead wrote:
But if the original exercise I was given wasn't about this abstract stuff that I already understand well, but to try and prove something like the fact that this adjunction exists in the first place, or to prove the correspondence between $B \times k \to E$ and sections $B \to E$ without category theory, then I think I'm out of luck!
Interesting. You are really learning math through category theory, instead of doing what many people do, which is learn math and then learn category theory.
If asked to prove there's a correspondence between vector bundle maps $B \times k \to E$ and sections of the vector bundle $E$, I would never think of using category theory: I'd just write down the definition of each thing and notice how they correspond.
What's a vector bundle map $f : B \times k \to E$? It's a linear map $f_b : k \to E_b$ for each point $b \in B$, varying continuously with $b$. Here $E_b$ is the fiber of $E$ over $b$, which is a vector space.
But this linear map is determined by $f_b(1)$, since $k$ is 1-dimensional with $1$ as a basis vector.
So to have a linear map $f_b : k \to E_b$ for each point $b \in B$, varying continuously with $b$, is the same as to have an element $f_b(1) \in E_b$ for each $b$, varying continuously with $b$.
But the latter is a section of $E$.
That's what I'd say if asked to prove this result. There is, however, a lot of background knowledge assumed here. For example: what exactly do I mean by "continuously varying with $b$", both for elements of $E_b$ and for linear maps $f_b : k \to E_b$? Why does one vary continuously with $b$ iff the other does? And so on.
These are things one learns when studying bundles and vector bundles.
So, I can imagine that if I was unfamiliar with these concepts, yet somehow knew a lot of category theory, I might try to leverage my knowledge of category theory to give a very different proof, like you did.
Something like:
The forgetful functor from vector bundles over $B$ to fiber bundles over $B$ has a left adjoint which replaces the fiber of a bundle with the free vector space on that fiber... somehow given a topology, I'm not sure how! (This is just a strategy for a proof.)
Applying this functor to the terminal fiber bundle over $B$, namely $\mathrm{id}_B$, we get the trivial line bundle $B \times k$.
The terminal fiber bundle is the 'walking fiber bundle over $B$ with a section': morphisms from this to any other fiber bundle $E$ correspond to sections of $E$.
'Therefore', by how left adjoints preserve universal properties involving morphisms out of an object, the trivial line bundle $B \times k$ is the 'walking vector bundle over $B$ with a section': morphisms from this to any other vector bundle $E$ correspond to sections of $E$.
This is very sketchy and it seems @John Onstead already worked it out more efficiently with fewer questionable leaps of logic.
It's an interesting approach - step 1 brings up a question I've never thought about, namely "what's the free topological vector space on a topological space?" (Luckily this is not hard to answer for the only case we really need, the one-point topological space.)
But as it stands, I'd never actually take this approach unless there were some pressing reason to do so.
John Onstead said:
I'm a little confused on how the fibers of the projection have the structure of a one dimensional space. What if we are working with the topological field C of complex numbers? Doesn't that space have two dimensions, the real and imaginary axis?
To make a vector space, you need to choose three things:
- the vectors (the elements of some set)
- the scalars (the elements of some field)
- a way for the scalars to multiply the vectors
Intuitively, the more "powerful" the scalars are, the lower the dimension of the resulting vector space.
For example, you can create a vector space as follows:
- the vectors are elements of $\mathbb{C}$
- the scalars are elements of $\mathbb{R}$
- the scalars multiply the vectors using the usual multiplication of a real number with a complex number
In this case, we'll find that the resulting vector space is two dimensional. One basis, for example, is $\{1, i\}$ and we can write any complex number as $a \cdot 1 + b \cdot i$ where $a, b \in \mathbb{R}$.
However, we can also create a vector space like this:
- the vectors are elements of $\mathbb{C}$
- the scalars are elements of $\mathbb{C}$
- the scalars multiply the vectors using the usual multiplication we have in $\mathbb{C}$
In this case, the resulting vector space is one dimensional! One basis is $\{1\}$ and we can write any complex number as $a \cdot 1$ where $a \in \mathbb{C}$.
So, to be able to say what the dimension of a vector space is, one needs to know what the scalars are for that vector space! If we only know what the vectors are, that's not necessarily enough information to determine the dimension of the vector space.
A less pedagogical way to say what David just said: any field is a 1-dimensional vector space over itself.
But the fact that we commonly use two fields in mathematics, $\mathbb{R}$ and $\mathbb{C}$, causes a lot of interesting phenomena. For starters, any $n$-dimensional vector space over $\mathbb{C}$ has an underlying $2n$-dimensional vector space over $\mathbb{R}$. This leads to other things: we can define a concept of [[complex manifold]], and any $n$-dimensional complex manifold has an underlying $2n$-dimensional real manifold. (A real manifold is usually called just a manifold!)
Digging in slightly deeper: there is a forgetful functor
$$U : \mathsf{Vect}_{\mathbb{C}} \to \mathsf{Vect}_{\mathbb{R}}$$
and this has a left adjoint called [[complexification]]:
$$(-) \otimes_{\mathbb{R}} \mathbb{C} : \mathsf{Vect}_{\mathbb{R}} \to \mathsf{Vect}_{\mathbb{C}}$$
These are incredibly potent throughout math and physics; I could talk about this for hours.
Here's a cool fact: complexification is not only the left adjoint to the forgetful functor $U : \mathsf{Vect}_{\mathbb{C}} \to \mathsf{Vect}_{\mathbb{R}}$, it's also the right adjoint! We say these functors are [[ambidextrous adjoints]]. This has all sorts of consequences.
But coming back to @John Onstead's original question: any complex vector bundle with n-dimensional fibers has an underlying real vector bundle with 2n-dimensional fibers. So you can't talk about the dimensionality of the fibers without specifying the field.
Conversely, any real vector bundle with n-dimensional fibers can be complexified, fiberwise, to give a complex vector bundle with n-dimensional fibers.
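In symbols, the dimension bookkeeping from the last few messages: for the forgetful functor $U : \mathsf{Vect}_{\mathbb{C}} \to \mathsf{Vect}_{\mathbb{R}}$ and complexification $(-) \otimes_{\mathbb{R}} \mathbb{C}$,
$$\dim_{\mathbb{R}} U(W) = 2 \dim_{\mathbb{C}} W, \qquad \dim_{\mathbb{C}} (V \otimes_{\mathbb{R}} \mathbb{C}) = \dim_{\mathbb{R}} V,$$
and the same formulas apply fiberwise to the bundle constructions just described.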
John Baez said:
But the fact that we commonly use two fields in mathematics, $\mathbb{R}$ and $\mathbb{C}$, causes a lot of interesting phenomena. For starters, any $n$-dimensional vector space over $\mathbb{C}$ has an underlying $2n$-dimensional vector space over $\mathbb{R}$. This leads to other things: we can define a concept of [[complex manifold]], and any $n$-dimensional complex manifold has an underlying $2n$-dimensional real manifold. (A real manifold is usually called just a manifold!)
What happens in the infinite-dimensional case, such as for infinite-dimensional vector spaces over $\mathbb{R}$ and $\mathbb{C}$?
People who study infinite-dimensional real and complex manifolds want to base the theory not on mere vector spaces but topological vector spaces, typically locally convex topological vector spaces (cf. [[Banach manifold]] and [[Frechet manifold]]). I have never seen anyone use such a space for this, since that's not a vector space differential geometers talk about. I guess it could be made into a locally convex topological vector space....
John Baez said:
Interesting. You are really learning math through category theory, instead of doing what many people do, which is learn math and then learn category theory.
Yes this was my original goal with learning CT. Since CT involves patterns across all fields of math, I thought learning it would help me more quickly to understand what is going on in each other field of math. However it seems it isn't always this straightforward. I believe I saw a video on CT where they made the analogy that if math was like a book, CT would be the setting of the book, but there's other stuff in addition to the setting like the plot, characters, etc. that might be hard to track from the "zoomed out" perspective of the setting alone.
John Baez said:
So to have a linear map $f_b : k \to E_b$ for each point $b \in B$, varying continuously with $b$, is the same as to have an element $f_b(1) \in E_b$ for each $b$, varying continuously with $b$.
I see. I think the information I was missing was that a vector space map for a 1d space is determined by $f_b(1)$. I vaguely remember learning that a matrix's action could be represented by what it did to the basis vectors of a space, but I hadn't thought to apply that here. But you are also right- I don't know about the "continuously varying" property and what this means. But I'm looking into it right now!
David Egolf said:
However, we can also create a vector space like this:
- the vectors are elements of $\mathbb{C}$
- the scalars are elements of $\mathbb{C}$
- the scalars multiply the vectors using the usual multiplication we have in $\mathbb{C}$
In this case, the resulting vector space is one dimensional! One basis is $\{1\}$ and we can write any complex number as $a \cdot 1$ where $a \in \mathbb{C}$.
This is really helpful, thanks so much!
I also think I can finally see now where we get the vector bundle B x k from. First, we form the actual topological bundle B x U(k) -> B, where U(k) is a topological space we are meant to take as underlying some topological field k. Then we define an extra vector space structure onto this object in Top/B such that every fiber has the structure of the field k as a vector space, given that any field is a 1-d vector space over itself. This ensures that every fiber will be 1-d, as well as giving us a way of reintroducing a notion of "the field of real numbers" for example into the world of Top/B.
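Making the fiber computation explicit (a one-line check of the claim above): the fiber of the projection $B \times U(k) \to B$ over a point $b$ is $\{b\} \times U(k) \cong U(k)$, which carries the vector space structure of $k$ as a module over itself, so
$$\dim_k (\text{each fiber}) = \dim_k k = 1$$
whether $k$ is $\mathbb{R}$ or $\mathbb{C}$.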
John Onstead said:
John Baez said:
Interesting. You are really learning math through category theory, instead of doing what many people do, which is learn math and then learn category theory.
Yes this was my original goal with learning CT. Since CT involves patterns across all fields of math, I thought learning it would help me more quickly to understand what is going on in each other field of math. However it seems it isn't always this straightforward.
You seem to be progressing very rapidly. Suppose an undergraduate and graduate degree in math typically takes about 9 years. It may take you just 6 years of hard work to get an equivalent understanding of the core branches of pure math - like set theory and logic, topology, differential geometry, algebra, algebraic geometry, analysis, and probability theory.
That's an extremely rough estimate, of course, and it depends a lot on what you do, but I'm impressed by your progress!
I believe I saw a video on CT where they made the analogy that if math was like a book, CT would be the setting of the book, but there's other stuff in addition to the setting like the plot, characters, etc. that might be hard to track from the "zoomed out" perspective of the setting alone.
That's true. To learn math well, we always need to keep leaving our comfort zone and taking new perspectives. For some people it takes a lot of courage to break free from detailed studies of individual topics and think about things in a broader way using category theory. For you it may take some courage to dive into details and read textbooks that present subjects without mentioning category theory. You may find yourself translating their treatment into category theory, which is always a good exercise - but you may also get better at understanding subjects "on their own terms", without the mediation of category theory, and that too is good.
John Onstead said:
I see. I think the information I was missing was that a vector space map for a 1d space is determined by $f_b(1)$. I vaguely remember learning that a matrix's action could be represented by what it did to the basis vectors of a space, but I hadn't thought to apply that here.
Yeah, that's an important fact all the time in linear algebra. Here's a fancy way to think about it: to say a vector space $V$ has basis $S$ is the same as to give a specific isomorphism $V \cong F(S)$, where
$$F : \mathsf{Set} \to \mathsf{Vect}_k$$
is the "free vector space on a set" functor. Thus for any vector space $W$ we get isomorphisms
$$\mathsf{Vect}_k(V, W) \;\cong\; \mathsf{Vect}_k(F(S), W) \;\cong\; \mathsf{Set}(S, U(W))$$
where
$$U : \mathsf{Vect}_k \to \mathsf{Set}$$
is the right adjoint of $F$, the "underlying set of a vector space" functor.
So, to specify a linear map $f : V \to W$ is the same as to specify a vector in $W$ for each element of the basis $S$!
John Onstead said:
I think I understand where you are coming from, but let me just make sure. It seems you are using the forgetful functor U as a way of "putting aside" the extra structure. So by denoting a topological space as U(V) rather than as B, E, etc., you are explicitly indicating via the notation that we are meant to take U(V) as the underlying topological space of some topological vector space, and that this structure is on hand should we need it later. In other words, let's say U(V) was the topological space R^3. It seems then that the object/notation U(V), in a sense, contains "more information" than if we just wrote down R^3, even though they are meant to refer to homeomorphic topological spaces. Please let me know if I'm on the right track!
Yes, that sounds basically right. To give a vector bundle with fiber $V$ involves giving a topological vector space $V$ together with a fiber bundle whose fiber is the underlying space $U(V)$, in other words.
Thanks for the help so far! As mentioned I wanted to move into the territory of discussing spaces of sections of bundles. The reason for this interest is that in physics, the two main ways of expressing a physical theory are in terms of fiber bundles (in which case a physical state is a section of this bundle) and state spaces like phase space. Being able to easily generate a space of sections of a bundle then instantly gives you a translation tool to get from the world of physical theories in terms of bundles to the world of physical theories in terms of state spaces.
The most obvious first step is to get the set of sections, which is just the hom set we were discussing above, such as Hom(idB, E) in the slice category over B, or Hom(B x k, E) in the vector bundle category. But recently I learned about a new way to think about "objects of sections" known as a dependent product. In a sufficiently nice category you can define a "dependent product" as the right adjoint to base change. So if you have a morphism A -> B, you can get a functor C/B -> C/A called the "base change" that you find via pullback, and the right adjoint to this, C/A -> C/B, gives you the dependent product along that morphism. If B is the terminal object and the morphism you choose is the canonical morphism into the terminal object, this corresponds to a base change C/* ~ C -> C/A. The notion of an "object of sections" is the right adjoint to this, when it exists: a functor C/A -> C/* ~ C that sends a bundle over A to the "object of sections" of that bundle.
An interesting fact I learned about the dependent product is about how it relates to Cartesian and closed categories. If you have a category with a terminal object and pullbacks, this allows you to define the base change C -> C/A for any A, which actually sends an object B in C to B x A -> A in C/A. This is a reflection of the property that any category with a terminal object and pullbacks will also have products (and thus be a Cartesian category!) If in your category this base change always has the object of sections adjoint C/A -> C for any A, then you will always be able to define objects of sections of the map B x A -> A. But from our discussion above we realized that sections of B x A -> A correspond to maps A -> B! Thus, such a category will not only be Cartesian but Cartesian closed, with the internal hom Hom(A, B) given by this dependent product!
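As a formula, the observation in the previous paragraph says that, writing $\Pi_A : C/A \to C$ for the dependent product along $A \to *$,
$$\Pi_A\big(A \times B \xrightarrow{\;\pi\;} A\big) \;\cong\; B^A,$$
since sections of the trivial bundle $A \times B \to A$ correspond to maps $A \to B$.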
Anyways, I was excited to share my findings because I find dependent products very interesting. But now it's time to get down to business with my questions. First, it seems easy to define a "space of sections" when we are just talking about a topological bundle. We just have to find a cartesian closed category of spaces (we discussed this above) and use the dependent product mentioned above, Spaces/A -> Spaces, sending a bundle over A to the space of its sections.
However, the situation gets more messy when we extend to talking about spaces of sections of vector bundles. In the simplest case where we want a topological space of vector bundle sections, we really don't care about the extra structure since it doesn't matter if the points of a topological space are "supposed" to be vectors, so the above construction is just fine. But oftentimes, the set of sections of a vector bundle has a vector space structure on it. So my question is, how does a set of sections of a vector bundle get this vector space structure? Is there some sort of enriched category thing where VectBund(B) is enriched in RVect, thus allowing a hom functor of form Hom(B x R, E) to send you straight into the vector space of sections of E? Or is there a way to do this using dependent products of some form? Thanks again for your help!
Oh, that’s funny, I was so down on the importance of your typing worries but now they’re really coming back with a vengeance!
What we’re probably going to need to do here is to make the categories of real vector bundles over a base into a fibration over the category of spaces.
I don’t know whether you know what a fibration is, but it’s enough to give a “functor” from spaces into categories, mapping each space X to the category of real vector bundles over X (and vector bundle maps).
Pulling back bundles almost gives a contravariant functor; there’s a picky detail that pullbacks are only functorial up to isomorphism, so that really this is a pseudofunctor.
Anyway, I think you’ll find that the pullback functor from vector bundles over a point into vector bundles over X has a right adjoint, which is your space of sections, as a vector bundle over the point, ie a vector space, desired structure saved!
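Schematically, the adjunction I mean looks like this (a sketch, writing X × V for the pullback of a vector space V along X → ∗ and Γ(E) for the sections):
```latex
% Pullback from vector bundles over a point (= vector spaces) to
% vector bundles over X, with its right adjoint "sections":
\mathrm{VectBund}(X)\big(X \times V,\; E\big)
\;\cong\;
\mathrm{Vect}\big(V,\; \Gamma(E)\big).
% Because the right adjoint lands in Vect, the set of sections
% automatically carries its vector space structure.
```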
I don’t think you’ll find dependent products for fiber bundles or for vector bundles, on the basis that spaces aren’t locally Cartesian closed. But maybe some more of them exist, I don’t really know.
Kevin Carlson said:
Anyway, I think you’ll find that the pullback functor from vector bundles over a point into vector bundles over X has a right adjoint, which is your space of sections, as a vector bundle over the point, ie a vector space, desired structure saved!
Wow this is really cool! It's interesting that analogues to the base change/dependent product construction can work in more general settings than just slice categories over an object. The connection to Grothendieck constructions makes this more clear as well!
Kevin Carlson said:
I don’t think you’ll find dependent products for fiber bundles or for vector bundles, on the basis that spaces aren’t locally Cartesian closed. But maybe some more of them exist, I don’t really know.
I might be able to find a "nice" category of spaces where they exist, such as the quasitopos discussed above.
Yeah, you might, but god knows what a fiber bundle over a pseudotopological space is! (I didn't check whether pseudotopological spaces are the quasitopos you mean but the same point probably goes.) The nice thing about the merely cartesian closed convenient categories of spaces is that they (some of them anyway) are actually subcategories of the usual category.
John Onstead said:
Wow this is really cool! It's interesting that analogues to the base change/dependent product construction can work in more general settings than just slice categories over an object. The connection to Grothendieck constructions makes this more clear as well!
Yes, it is really cool! This is opening a huge can of worms around categorical semantics of dependent type theory; if you're into that, you might imagine that you have a category of contexts and a fibration of types-in-context over that category; pullback is then context extension and left- and right- adjoints to pullback are Σ- and Π-types.
Vector bundles are very widespread in physics and are arguably the most important structures you can put on a bundle. Not only do they include vectors in the traditional sense, they also include tensors, spinors, and multivectors (and pseudovectors) since all those other things also form vector spaces. Now that we have a way of expressing vector bundle sections and dealing with vector bundles in a categorical way, we can deal in an efficient way with these most common of bundle structures, and can discuss vector, tensor, spinor, and multivector fields all at the same time. And it was all thanks to the protagonist and hero of our story, the adjunction between VectBund(B) and Top/B!
But on reflection, it almost seems like a stroke of luck that this all happened to work out in the end, and that we caught a really lucky break in that there just so happened to be this adjunction. While it's certainly quite a relief, I can't help but have the nagging feeling that we might not get so lucky when considering other notions of structure on a bundle. Indeed, many other kinds of bundles exist, including principal bundles, group bundles, affine bundles, and so on. It might be possible for one of these categories of structured bundles to not have an adjunction with Top/B, in which case we would fail to find a corresponding object to B x k whose morphisms would correspond to sections of this kind of structured bundle!
So here's my question: are there any categories of important structured fiber bundles without this kind of adjunction (that is, does the above nightmare scenario ever come true)? Is there some theorem telling us when an adjunction between Top/B and a category of structured bundles over B exists? And lastly, is it still possible to define a notion of "section" within a category of structured bundles even without this adjunction (perhaps by finding related categories where there is an adjunction)? Thanks!
Since I've never tried to think about sections in terms of adjoint functors until you raised the topic, I can't answer any of these questions. I just work with sections "by hand".
I see... Well, I'm wondering if this problem may be attacked from the universal algebra perspective. A vector space object is a model of a Lawvere theory, so a vector bundle is a model of a Lawvere theory in Top/B. The other structures commonly put on fiber bundles I mentioned also seem like they should have some Lawvere theory corresponding to them. Now, the category of models of a Lawvere theory in Set always comes with a free/forgetful adjunction, but the problem is, I'm pretty sure this is only established for Set due to the correspondence between Lawvere theories and monads. So I'm not sure if this result generalizes to categories beyond Set like Top/B. Are there any useful theorems about when Lawvere theories give rise to adjunctions outside of Set?
Edit: Actually I don't know if vector spaces have a Lawvere theory since fields don't, but modules certainly do.
A vector space object is a model of a Lawvere theory, so a vector bundle is a model of a Lawvere theory in Top/B.
A vector bundle over B does indeed give a model of the Lawvere theory for vector spaces in Top/B - or 'vector space object in Top/B', for short. Unfortunately the concept of 'vector space object in Top/B' omits the 'local triviality' assumption that people always make when dealing with vector bundles (because all the really interesting theorems make use of that assumption).
E.g., I could make a topological space over R that is the disjoint union of the spaces (-∞, 0) × R and [0, ∞) × R^2, with projection onto the first coordinate. This would be a vector space object in Top/R, but not a vector bundle over R.
Here the fiber over x is R for negative x and R^2 for positive x. But I could go wild and make the fiber be R over rational numbers and R^2 over irrational numbers; I could still get a vector space object in Top/R.
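To make the first example concrete, here is a sketch (the names E and π are mine):
```latex
% A vector space object in Top/R that is not a vector bundle:
E \;=\; \big((-\infty,0) \times \mathbb{R}\big) \,\sqcup\, \big([0,\infty) \times \mathbb{R}^{2}\big)
\;\xrightarrow{\;\pi\;}\; \mathbb{R},
% with pi the projection onto the first coordinate.  Fiberwise addition
% and scalar multiplication are continuous on each clopen piece, but no
% neighborhood of 0 in R admits a trivialization, since the fiber
% dimension jumps from 1 to 2 there.
```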
So one question is whether you want to spend time expressing vector bundles in the language of category theory (which is a fun and worthwhile activity) or learning about vector bundles (which is also a fun and worthwhile activity). There's a relation between the two activities, and if you can ever get the first one to work they'd eventually merge, but they start out being somewhat different.
If you lean toward the former (as seems clear), a good challenge would be understanding the concept of "local triviality" in categorical terms. This concept has different manifestations for vector bundles and fiber bundles, and also for principal G-bundles, but they're all similar enough that there should be some categorical generalization.
For the particular question, you don't exactly need a whole left adjoint from whatever-bundles to spaces over B; you just need the functor of sections to be representable on whatever-bundles. This is tautologically going to be a job for a whatever-bundle freely generated by one section, which somewhat less precisely is always going to be an appropriate kind of trivial bundle. For principal G-bundles over B, this is B×G; for fiber bundles tout court, this is just the identity map on B. I can't immediately think of a flavor of bundle that doesn't allow this representability result, but it's an interesting thing to wonder.
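In symbols, the representability would look like this (a sketch; T is my name for the representing bundle):
```latex
% Sections as a representable functor on whatever-bundles over B:
\Gamma(E) \;\cong\; \mathrm{Hom}_{\mathrm{Bund}(B)}\big(T,\; E\big),
% where T is the whatever-bundle freely generated by one section:
%   vector bundles:       T = B x k  (trivial line bundle)
%   principal G-bundles:  T = B x G  (trivial G-bundle)
%   plain fiber bundles:  T = id_B : B -> B
```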
John Baez said:
A vector bundle over B does indeed give a model of the Lawvere theory for vector spaces in Top/B - or 'vector space object in Top/B', for short
Ah, that's interesting that there is a Lawvere theory for vector spaces when there isn't one for fields! Good to know for future reference.
John Baez said:
Unfortunately the concept of 'vector space object in Top/B' omits the 'local triviality' assumption that people always make when dealing with vector bundles (because all the really interesting theorems make use of that assumption).
Ah, for some reason I keep forgetting the local triviality condition. Though this does make me wonder if local trivialization can be included as an additional axiom in the Lawvere theory for vector space objects, or if one would need a stronger theory (such as a finite limits theory) to write down this condition. What I'm also confused about is how local triviality differs between topological and vector bundles. Isn't a locally trivial vector bundle just a locally trivial topological bundle with the extra structure on top?
In any case, according to the nlab article on vector bundles the local triviality is defined by some isomorphism of vector bundles, but interestingly this takes place in the category of vector space objects in Top/U, for U an element of some open cover of the base space. I'm not sure how you'd fit all this information into Top/B, which you'd somehow have to do to avoid the information necessary to define a locally trivial vector bundle being scattered across multiple different categories. The categorical generalization might then be to state an analogous isomorphism in some category of structured objects in Top/U (and maybe in the slice category Top/U directly, in the case of a plain topological bundle).
John Onstead said:
Ah that's interesting there is a Lawvere theory for vector spaces when there isn't one for fields! Good to know for future reference.
Vector spaces over a given field k are described by a Lawvere theory, since a vector space is characterized by operations (addition, subtraction, zero, multiplication by c for each c ∈ k) obeying solely equational laws. A field is not.
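As a sketch, the operations and axioms read:
```latex
% Lawvere theory of vector spaces over a field k: a binary operation +,
% a constant 0, negation, and one unary operation c.(-) per scalar c in k,
% subject to purely equational laws such as
c \cdot (x + y) \;=\; c \cdot x + c \cdot y, \qquad
(c + d) \cdot x \;=\; c \cdot x + d \cdot x,
\qquad
c \cdot (d \cdot x) \;=\; (cd) \cdot x, \qquad
1 \cdot x \;=\; x.
% No operation ever inverts a scalar, so the field axioms never enter.
```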
Vector spaces over a field were my first exposure to a Lawvere theory that required 'shitloads' of unary operations - one for each element of the field.
Kevin Carlson said:
This is tautologically going to be a job for a whatever-bundle freely generated by one section, which somewhat less precisely is always going to be an appropriate kind of trivial bundle. For principal G-bundles over B, this is B×G; for fiber bundles tout court, this is just the identity map on B. I can't immediately think of a flavor of bundle that doesn't allow this representability result, but it's an interesting thing to wonder.
This makes sense, and I'm certainly glad that this representability appears to be a common result. It seems logical that for most well-behaved notions of bundle there's a trivial version in some form. I think I'll just need to convince myself that these can represent sections. For instance, it makes sense for vector bundles with the trivial line bundle, but it seems to work only because a vector space map is entirely determined by where it sends the basis vectors, and the trivial line bundle has only one dimension per fiber, meaning only one basis vector. Other kinds of structure may not have maps entirely determined by a single element. I'll certainly be looking more into this as I learn about the other kinds of structures on bundles!
John Baez said:
Vector spaces over a field were my first exposure to a Lawvere theory that required 'shitloads' of unary operations - one for each element of the field.
This is very interesting! I think I see what you mean. The "division" is taken care of within the field by having the field contain a set of multiplicative inverses. Then, the vector space only needs to care about scalar multiplication, with "scalar division" just being multiplication that happens to be by the multiplicative inverse of some other element of the field. But if this is the case, the theory works just the same with a general ring; the only difference is that the ring we are using just so happens to be a field.
Yeah, there's no specially privileged operation of 'scalar division' for vector spaces over a field: a vector space is just a module over a ring that happens to be a field.
John Baez said:
If you lean toward the former (as seems clear), a good challenge would be understanding the concept of "local triviality" in categorical terms. This concept has different manifestations for vector bundles and fiber bundles, and also for principal G-bundles, but they're all similar enough that there should be some categorical generalization.
I wonder if such a categorical generalization of "local triviality" could also encompass our requirement that manifolds locally look like some R^n. (Maybe manifolds are too different from bundles, though?)
They're somewhat different, but maybe a sufficiently elevated outlook can unify them.
I don't know much about vector bundles, but it feels like in the finite-dimensional case they should correspond to finite-dimensional vector spaces in Sh(B).
I predict that finite-dimensional vector bundles on a topological space B give some but not all the finite-dimensional vector spaces in the topos Sh(B).
Consider the sheaf F of finite-dimensional vector spaces on B where F(U) = {0} if U does not contain some chosen point b and F(U) = R if U contains b. I expect that this counts as a finite-dimensional vector space in Sh(B). But it's not the sheaf of sections of a vector bundle on B.
It might depend a bit on what you mean by [[finite set]]. What is the dimension of your F?
In algebraic geometry, n-dimensional vector bundles over a scheme X correspond to locally free O_X-modules of rank n.
(In this situation I believe the topos of sheaves over X is a [[ringed topos]] where the ring object is called O_X, so it makes sense to talk about a module of this ring object. Maybe it makes sense for a module of any ring object in any topos to be "locally free", though I don't really know how to define the concept at that level of generality: I just know how to define it in the case mentioned above.)
If we're talking about real vector bundles on a space X, then we have the sheaf R of continuous maps to the real numbers (which is also the Dedekind real numbers object in Sh(X)), and it seems we should be similarly talking about modules over that ring object. And for any fixed external natural number n we have a free module R^n, and it seems to me like the local freeness condition makes sense: there is an open covering of X on each element of which the module is isomorphic to R^n. And that looks like it should be equivalent to an n-dimensional vector bundle, since R^n is the sheaf of sections of the trivial n-dimensional vector bundle.
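So, as a sketch, the condition I have in mind reads:
```latex
% Local freeness for an R-module M in Sh(X), where R is the sheaf of
% continuous real-valued functions: there is an open cover {U_i} of X with
M\big|_{U_i} \;\cong\; R^{n}\big|_{U_i} \qquad \text{for each } i,
% exactly parallel to trivializing an n-dimensional vector bundle
% over each member of the cover.
```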
But if we try to make "finite-dimensionality" internal to Sh(X) then I expect we end up with some more exotic things.
I mean "free on a Bishop-finite set" in the internal logic. (And yes, internal vector spaces over , of course.)
John Baez said:
I predict that finite-dimensional vector bundles on a topological space B give some but not all the finite-dimensional vector spaces in the topos Sh(B).
Consider the sheaf F of finite-dimensional vector spaces on B where F(U) = {0} if U does not contain some chosen point b and F(U) = R if U contains b. I expect that this counts as a finite-dimensional vector space in Sh(B). But it's not the sheaf of sections of a vector bundle on B.
I guess that is a "sheaf of finite-dimensional vector spaces", but it isn't a finite-dimensional vector space in Sh(B) in a very reasonable sense, since the generating set is only subfinite.
This suggests that my inclination was correct: https://mathoverflow.net/questions/268836/is-the-theory-of-vector-bundles-just-linear-algebra-done-in-a-suitable-topos
I had this question in the pipeline but now's as good a time as any to ask it given David's question above.
Differentiable manifolds and vector bundles have a lot in common. Both involve taking some space, decomposing it, and assigning a vector space structure onto each component. In the case of differentiable manifolds this is assigning the structure to the charts/open subsets, and in vector bundles this is assigning the structure to each fiber. The two notions converge in the case of the tangent bundle, which is a vector bundle canonically assigned to a differentiable manifold.
It seems you can really do most "differential-y" things on a differentiable manifold with only reference to the tangent bundle, which really makes me wonder: is the usual definition in terms of charts and atlases giving more information than is needed, and is all you really need just the tangent bundle? Formally: is a topological manifold M, equipped with the extra structure of a designated vector bundle TM -> M (maybe satisfying some property) which we define/specify to be its tangent bundle, sufficient to define a differentiable manifold? That is, is a vector bundle sufficient to encode all of the differentiable structure on a topological manifold? Why or why not? If so, why not just use this as the definition, since it seems less cumbersome than the atlas one? If not, what are examples of things you can do with a differentiable manifold that you couldn't do just knowing the tangent bundle?
Graham Manuell said:
John Baez said:
I predict that finite-dimensional vector bundles on a topological space B give some but not all the finite-dimensional vector spaces in the topos Sh(B).
Consider the sheaf F of finite-dimensional vector spaces on B where F(U) = {0} if U does not contain some chosen point b and F(U) = R if U contains b. I expect that this counts as a finite-dimensional vector space in Sh(B). But it's not the sheaf of sections of a vector bundle on B.
I guess that is a "sheaf of finite-dimensional vector spaces", but it isn't a finite-dimensional vector space in Sh(B) in a very reasonable sense, since the generating set is only subfinite.
Oh, very cool! I had thought of the distinction between 'finite' and 'subfinite' sets as one of those cute things that constructive mathematicians do, but here it's really hitting home because it's touching on something I care about!
(It must be because I'm a mathematical physicist that finite-dimensional vector bundles seem more like 'real life' than finite sets. :smirk:)
I'm still confused by this claim by [Ingo Blechschmidt](https://mathoverflow.net/a/268974/2893): nontrivial finite-dimensional vector bundles give free finite-dimensional R-modules as viewed internally in the topos Sh(X). It's as if 'free' automatically manages to mean 'locally free'. If this is true, it's also really cool.
John Onstead said:
I had this question in the pipeline but now's as good a time as any to ask it given David's question above.
Differentiable manifolds and vector bundles have a lot in common. Both involve taking some space, decomposing it, and assigning a vector space structure onto each component. In the case of differentiable manifolds this is assigning the structure to the charts/open subsets, and in vector bundles this is assigning the structure to each fiber.
I would not say we're putting a vector space structure on each chart of a smooth manifold M. Each chart is a smooth, not linear, isomorphism φ_α : U_α → R^n for some open set U_α ⊆ M. It's true that in some contexts R^n is considered a vector space, but φ_α is not preserving the vector space structure, only the smooth manifold structure.
So it would be very wrong to think of a manifold as a space covered by vector spaces. On the other hand, each fiber of a vector bundle is really a vector space: you can add and subtract points, multiply them by scalars, etc.
John Baez said:
I would not say we're putting a vector space structure on each chart of a smooth manifold M. Each chart is a smooth, not linear, isomorphism ϕα:Uα→Rn for some open set Uα⊆M. It's true that in some contexts Rn is considered a vector space, but ϕα is not preserving the vector space structure, only the smooth manifold structure.
Hmm, now I'm genuinely very confused. What are we doing differently with a smooth manifold than with the usual topological manifold if R^n is to be interpreted as just a topological space in both scenarios? By the wording it seems the only difference is that the isomorphism to R^n is "smooth" in the case of a smooth manifold (instead of being merely continuous in the case of topological manifolds), but this is circular reasoning since you already need the notion of a derivative and thus smooth manifold to define a smooth morphism in the first place! Also, where does the vector space structure come in that we ultimately need in order to do any sort of actual differentiation on the differentiable manifold? Is it purely in the tangent bundle?
37 messages were moved from this topic to #theory: category theory > Vector bundles vs internal vector spaces in a topos by Kevin Carlson.
John Onstead said:
John Baez said:
I would not say we're putting a vector space structure on each chart of a smooth manifold M. Each chart is a smooth, not linear, isomorphism φ_α : U_α → R^n for some open set U_α ⊆ M. It's true that in some contexts R^n is considered a vector space, but φ_α is not preserving the vector space structure, only the smooth manifold structure.
Hmm, now I'm genuinely very confused. What are we doing differently with a smooth manifold than with the usual topological manifold if R^n is to be interpreted as just a topological space in both scenarios?
It's not being interpreted as just a topological space in both scenarios. If it were, my sentence "each chart is a smooth isomorphism φ_α : U_α → R^n" would make no sense, since there's no such thing as a smooth map into a mere topological space.
By the wording it seems the only difference is that the isomorphism to R^n is "smooth" in the case of a smooth manifold (instead of being merely continuous in the case of topological manifolds), but this is circular reasoning since you already need the notion of a derivative and thus smooth manifold to define a smooth morphism in the first place!
Indeed. You need to have defined smooth manifolds before you're allowed to say the things about them that I just said. I wasn't defining them, just talking about them!
If you want the definition of smooth manifold click the link and set k = ∞, but here's a key fact: in a smooth manifold the transition functions are required to be smooth, in a topological manifold they're only required to be continuous. This then allows us to define smooth maps to and from open subsets of a smooth manifold in the usual way... but only continuous maps to and from open subsets of a topological manifold.
I think I see how this works. A smooth manifold is defined in terms of smooth maps, and by the link given, smooth maps are defined in terms of smooth maps between open subsets of Euclidean space. These are then in turn defined outright using the limit definition of a derivative in Euclidean space. This still does seem circular since you're just pushing the problem of having to define things (smooth maps, derivatives, and smooth manifold structure) down the road from smooth manifolds to Euclidean space, but I can kind of see how this might work if you instead work from Euclidean space as a full inner product space rather than as a smooth manifold. So it seems one path you might take is to first define Euclidean space as an inner product space, then define the derivative on Euclidean space, then derive the concept of a smooth map between open subsets of Euclidean space, then define a smooth map in general, and then define the concept of a smooth manifold. The last step is then to realize that Euclidean space itself is a smooth manifold. This gets rid of the circularity since you don't have to already assume Euclidean space is a smooth manifold in order to define the concept of a smooth map into it; you just have to assume it's an inner product space. But that's how I am rationalizing it; maybe there are better ways of getting around the circularity.
Admittedly, this is still a little weird since I'm used to defining a structure preserving map in terms of the structure it preserves rather than the other way around. Hearing a smooth manifold get defined in terms of smooth maps almost felt like being told "a group is a set G equipped with a group homomorphism G x G -> G such that..." But unlike with groups I guess there are some contexts in which you can do things backwards and define the map before the object.
I saw you had written something about math having an existing infrastructure that category theory just imposes itself around, or something like that. I can relate to this, because even though my explanation above gets rid of the circularity, it still doesn't morally work, since once again it cannot be done in category theory. This is because a smooth map is defined in terms of a continuous function into Euclidean space that satisfies the differentiability condition. But this cannot work in CT because you are defining a continuous function (a map in Top) into an inner product space where you can state the differentiability condition (an object in InnerProduct). Thus you are attempting to define a map of type Top into an object of type InnerProduct, which results in a typing error. I doubt there's an adjunction or representability result that saves us here either, since InnerProduct is quite a different category compared with Top.
This still does seem circular since you're just pushing the problem of having to define things (smooth maps, derivatives, and smooth manifold structure) down the road from smooth manifolds to Euclidean space, but I can kind of see how this might work if you instead work from Euclidean space as a full inner product space rather than as a smooth manifold.
Inner products have nothing to do with it.
Here's how it goes, though this is a poor substitute for reading a book about manifolds:
When we explain smooth manifolds, we first make sure people know what it means to take partial derivatives of a map f : U → V where U ⊆ R^n and V ⊆ R^m are open sets. The inner product plays no role here.
Then we say f is smooth if you can take as many partial derivatives as you want and they're all continuous, like
∂³f/∂x₁∂x₂∂x₃
is continuous.
Then we say a smooth manifold is a topological space M with an open cover U_α and homeomorphisms φ_α : U_α → R^n called charts such that the transition functions φ_β ∘ φ_α⁻¹ are all smooth. (Figure out what open sets are the appropriate domains and codomains for these transition functions.)
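(As a sketch of the answer to that parenthetical:)
```latex
% Domains and codomains of the transition functions:
\varphi_\beta \circ \varphi_\alpha^{-1} \;:\;
\varphi_\alpha(U_\alpha \cap U_\beta)
\;\longrightarrow\;
\varphi_\beta(U_\alpha \cap U_\beta),
% a map between open subsets of R^n, so "smooth" already makes sense
% for it by the previous step.
```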
So there's nothing circular about it and we're done while the category theorists are still revving up their engines. :upside_down:
John Baez said:
So there's nothing circular about it and we're done while the category theorists are still revving up their engines.
This is humorous, because it certainly seems true. Based on previous questions I asked on here and in general looking up resources on the web, it seems that the subjects of category theory and analysis just don't get along. Ironically this means category theory can explain abstract concepts from the highest levels of mathematics with ease, but finds itself in a bind when describing basic high school calculus or undergrad multivariable calc as per your example!
I'd still like to share why I brought up inner product spaces and my thought process behind it. From what I understand, to define a notion of "derivative", you need a space with sufficient "extra structure" to do so. This even includes Euclidean space; the reason we can learn calculus in high school without knowing about manifolds is that the structure is still there, it's just swept under the rug (made implicit) for pedagogical reasons.
From what I can see, there's two ways (probably more, but two ways I know of) of defining this sufficient "extra structure" on Euclidean space to be able to define a derivative. One way is to equip Euclidean space with an inner product space structure (or minimally some sort of topological vector space structure, but this comes naturally with the inner product space structure). The other way is to equip Euclidean space with a differentiable/smooth manifold structure. My point was that the former option is the only valid one since the latter introduces a circularity where you need the concept of a smooth manifold (Euclidean space) to define the concept of a smooth manifold!
Yes, this conversation is making me glad I learned category theory only after I learned analysis and differential geometry :upside_down:
By the way, what you need to define derivatives of functions out of R^n is the vector space structure on R^n, nothing about inner products. For example, given f : R^n → R we define
∂f/∂xᵢ(x) = lim_{ε→0} [f(x + εeᵢ) − f(x)]/ε
if the limit exists, where eᵢ = (0, ..., 0, 1, 0, ..., 0) with a 1 in the ith place.
More invariantly, given a finite-dimensional vector space V and a vector v ∈ V, the derivative of f : V → R in the direction v at a point x is
(D_v f)(x) = lim_{ε→0} [f(x + εv) − f(x)]/ε
if the limit exists. You'll see the key thing we need is to multiply the vector v by the scalar ε and then add it to x: just what you can do in a vector space!
So we start in calculus by treating R^n's as vector spaces to define smooth maps between them. Then we create a category of R^n's and smooth maps between them, or open subsets of R^n's and smooth maps between them. Then we use that to create a category of smooth manifolds and smooth maps between them.
At least this is one standard approach.
A somewhat slicker approach is to define a category of [[diffeological spaces]] and smooth maps between them... but again, only after we know about R^n's and smooth maps between those.
John Baez said:
You'll see the key thing we need is to multiply the vector v by the scalar e and then add it to x: just what you can do in a vector space!
So we start in calculus by treating Rn's as vector spaces to define smooth maps between them. Then we create a category of Rn's and smooth maps between them, or open subsets of Rn's and smooth maps between them. Then we use that to create a category of smooth manifolds and smooth maps between them.
Ah this makes sense and puts it all into perspective, thanks!
I think I'll leave this here for today since I want to get back on track with fiber bundles, and I have a lot more to learn about them, especially with a concept I want to bring up tomorrow. Though I'd really like to continue a discussion on manifolds another time! I've also really been thinking about this typing problem with category theory (i.e., how we can make a category with topological vector spaces as objects and smooth, not necessarily linear, maps between them), and I think I might have a general solution to it, but it requires double category theory. I may get into this in another thread at some point too!
Isn't that the purpose of synthetic differential geometry? (To define axiomatically what it means to be smooth.)
Peva Blanchard said:
Isn't that the purpose of synthetic differential geometry? (To define axiomatically what it means to be smooth.)
I think you're right! That would be another good way of defining a space with sufficient structure for differentiation. I like the approach of synthetic differential geometry too since it deals explicitly with defining differentiation and other tools of calculus in terms of infinitesimal quantities, rather than with more vague notions of "focusing in on a point" such as what is given by limits.
The next important thing to cover is a connection on a fiber bundle, such as a vector bundle. Whenever I've tried to learn fiber bundles before, this has always been the point where I was forced to stop, since I could no longer figure out what was going on and the math got exponentially more complicated. It seems a connection is a way to "parallel transport" along a vector bundle; I've also seen it described as a way to compare between two local coordinate systems.
But my first main question is: what even IS, ontologically, a connection? Is it "extra structure" to be added onto a vector bundle? That is, can I construct a category VectorBundleWithConnection(B) such that there's a forgetful functor to VectBund(B) that forgets the structure of the connection, just like any "extra structure" in mathematics? Or is it some vague abstract nebulous thing that will be tough to precisely pin down? Secondly, where can a connection be defined? Only on smooth manifolds, or can you do it with general vector bundles?
Thirdly, I tried taking a look at the nlab page for connection and my eyes glazed over. They explained it in a few different ways, yet none of these ways did anything to help me. Let's say some guy who has never taken any advanced math classes (basic high school calculus was his last math class) and who only understands the concepts of object, morphism, category, functor, universal property, and stuff, structure, and property came up to you and asked you what a connection is. Would you be able to explain it to them in a simple way purely in terms of those concepts? Thanks!
John Onstead said:
Let's say some guy who has never taken any advanced math classes (basic high school calculus was his last math class) and who only understands the concepts of object, morphism, category, functor, universal property, and stuff, structure, and property came up to you and asked you what a connection is. Would you be able to explain it to them in a simple way purely in terms of those concepts? Thanks!
Mmh, I don't know if there is such an explanation; if there is, it does not seem to be mainstream. Maybe you are expecting a short path where there is none. From a pragmatic point of view, I'd suggest studying mainstream differential geometry first. But I'd be interested to know whether a categorical description of connections exists.
Probably the closest you can get to something like that is Kock's A Combinatorial Theory of Connections
Obviously this isn't going to get you all the way. To make precise the connection between this stuff and the usual definition you need access to a synthetic differential geometry. But this definition is pretty explicit and combinatorial, and informally shows you "what's really going on" in a nice way
@John Onstead
But my first main question is: what even IS, ontologically, a connection? Is it "extra structure" to be added onto a vector bundle? That is, can I construct a category VectorBundleWithConnection(B) such that there's a forgetful functor to VectBund(B) that forgets the structure of the connection, just like any "extra structure" in mathematics?
Yes, and this forgetful functor is faithful, so the connection is really extra structure, not extra stuff.
Or is it some vague abstract nebulous thing that will be tough to precisely pin down?
All math I don't understand yet seems vague, abstract and nebulous, but after I put in the effort to learn it, it magically becomes clear, crisp and remarkably concrete.
Secondly, where can a connection be defined? Only on smooth manifolds, or can you do it with general vector bundles?
That's not the right dichotomy! You can define a connection on any smooth fiber bundle over any smooth manifold, or on any smooth vector bundle over any smooth manifold.
(I'm saying the word 'smooth' twice as often here as people usually say it, just to be super-clear: as soon as you say 'over any smooth manifold' the convention is that you're working in the category Diff where everything is smooth.)
Thirdly, I tried taking a look at the nlab page for connection and my eyes glazed over.
Don't try to learn it there. Learning math online has serious limitations. Start reading books! For example, read a bit of my book Gauge Fields, Knots and Gravity. I explain connections for vector bundles. This book is known for being understandable.
You just need to look at one section of the book to learn what a connection is.
When I was learning about manifolds, bundles and connections, I greatly enjoyed this:
This is not a book you read cover to cover: it has definitions of hundreds of things, and clear theorem statements, and proofs, in a very well-organized way. There are a number of equivalent but different-looking definitions of 'connection', and they go through a bunch of the most important ones, not just for vector bundles but for more general fiber bundles and also principal bundles (which are very important too).
Let's say some guy who has never taken any advanced math classes (basic high school calculus was his last math class) and who only understands the concept of object, morphism, category, functor, universal property, and stuff structure property came up to you and asked you what a connection is. Would you be able to explain it to them in a simple way purely in terms of those concepts?
Sure! But having written a paper that does it, I'm too tired to do it here.
Read this:
- John Baez and John Huerta, An invitation to higher gauge theory, Section 2: Categories and connections.
But beware: this category-friendly approach to connections is not a replacement for learning about connections in the usual way: you'll know one nice viewpoint on what a connection is, but you'll still be unable to understand most people when they talk about connections. What they say will still sound "vague, abstract and nebulous" (even though it's not).
To really learn connections, I recommend the relevant sections of Analysis, Manifolds and Physics.
Chris Grossack (they/them) said:
Obviously this isn't going to get you all the way. To make precise the connection between this stuff and the usual definition you need access to a synthetic differential geometry. But this definition is pretty explicit and combinatorial, and informally shows you "what's really going on" in a nice way
Thanks, I'll read through this!
John Baez said:
Yes, and this forgetful functor is faithful, so the connection is really extra structure, not extra stuff.
This helps a lot in pinning down what a connection "is". Now I just need to learn more about what this extra structure is!
John Baez said:
That's not the right dichotomy! You can define a connection on any smooth fiber bundle over any smooth manifold, or on any smooth vector bundle over any smooth manifold.
(I'm saying the word 'smooth' twice as often here as people usually say it, just to be super-clear: as soon as you say 'over any smooth manifold' the convention is that you're working in the category Diff where everything is smooth.)
Ok, so if I'm understanding correctly, a smooth bundle is a bundle p: E -> B in Diff rather than Top. So I guess when dealing with sections and extra structure on smooth bundles, we are working in the category Diff/B rather than Top/B as before (but hopefully everything is still analogous)!
John Baez said:
Read this:
- John Baez and John Huerta, An invitation to higher gauge theory, Section 2: Categories and connections.
To really learn connections, I recommend the relevant sections of Analysis, Manifolds and Physics.
Will check those out, thanks! I'll keep updated on if I have any questions going through this material.
John Baez said:
For example, read a bit of my book Gauge Fields, Knots and Gravity. I explain connections for vector bundles. This book is known for being understandable.
I'm actually heading in the direction of gauge theory in this discussion, so it's good to know someone here literally wrote the (or a) book on gauge theory!
John Onstead said:
John Baez said:
You can define a connection on any smooth fiber bundle over any smooth manifold, or on any smooth vector bundle over any smooth manifold.
(I'm saying the word 'smooth' twice as often here as people usually say it, just to be super-clear: as soon as you say 'over any smooth manifold' the convention is that you're working in the category Diff where everything is smooth.)
Ok, so if I'm understanding correctly, a smooth bundle is a bundle p: E -> B in Diff rather than Top. So I guess when dealing with sections and extra structure on smooth bundles, we are working in the category Diff/B rather than Top/B as before (but hopefully everything is still analogous)!
Right.
And also note I said fiber bundle, so local triviality is assumed when discussing connections.
I'm going through some of the sources recommended and I keep seeing a mention of a "covariant derivative". This notion of a derivative depends on the connection and seems to be how you do derivatives on smooth manifolds (when you have some connection on the tangent bundle of the smooth manifold). But remember yesterday when we defined a derivative using the vector space structure on Euclidean space and the limit definition of a derivative? So I'm a little confused about why all these derivatives are now popping up. I guess my question here would be: how much of a covariant derivative is due to the structure added by a connection, and how much of it is due to the differentiable structure on the manifold the tangent bundle is defined over? And secondly (perhaps as part of the first question), why do covariant derivatives and Euclidean vector space derivatives behave so similarly that we'd even give them the same name of "derivative"?
A 'connection' on a vector bundle is a well-behaved rule for taking derivatives of smooth sections of that bundle. For historical reasons these derivatives are called 'covariant derivatives'.
Without a connection, it is impossible to take derivatives of smooth sections of a vector bundle. This is why connections are essential.
However, a trivial vector bundle has a god-given obvious connection so you can always use that one if you want.
And secondly (perhaps as part of the first question), why do covariant derivatives and Euclidean vector space derivatives behave so similarly that we'd even give them the same name of "derivative"?
They both obey versions of the two key rules for derivatives: linearity and the product rule.
Heck, they both are derivatives: a derivative is a way of measuring how rapidly something is varying.
Finally, the usual derivative of a smooth real-valued function on a manifold M is a covariant derivative! A smooth function f : M → R is the same as a smooth section of the trivial line bundle M × R → M. Its covariant derivative using the god-given obvious connection I alluded to is the same as its ordinary derivative.
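As a quick sketch of that last point:
```latex
% The "god-given" connection on the trivial line bundle M x R -> M:
% a section is just a smooth function f : M -> R, and for a vector
% field v on M its covariant derivative is the ordinary directional
% derivative,
\nabla_v f \;=\; v(f) \;=\; df(v),
% which is linear in v and f and obeys the product rule
\nabla_v (fg) \;=\; (\nabla_v f)\, g \;+\; f\, (\nabla_v g).
```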
(This connection is not obvious until you know what a connection is, of course - but then it is.)
John Baez said:
Finally, the usual derivative of a smooth real-valued function on a manifold M is a covariant derivative! A smooth function f:M→R is the same as a smooth section of the trivial line bundle M×R→M. Its covariant derivative using the god-given obvious connection I alluded to is the same as its ordinary deriative.
(This connection is not obvious until you know what a connection is, of course - but then it is.)
That's quite the amazing connection (pun intended)! Now things are making more sense.
Thanks for all the help so far, I've learned so much! I think I'll take a break from this for a few days so I can review what I've learned and also so I can study connections in more detail. I will be back to discuss gauge theory in the context of mathematical physics soon. Before that, I was reviewing my notes and I wanted to clarify my understanding of local triviality of vector bundles:
First, I want to confirm local trivialization is a property and not a structure. That is, given the category of vector bundles VectBund(B) and the category Vect(Top/B) of vector space objects in Top/B, there's a fully faithful embedding VectBund(B) -> Vect(Top/B) whose image consists of the vector space objects satisfying local triviality. I'm confused about this since it seems like it should be a property, but the nlab definition gives it in terms of an open cover, and there can be multiple different open covers for a given space. Secondly, the way to connect the abstract definition of a vector bundle (as a vector space object in Top/B) with the usual fiberwise definition is to show that the vector space structure "distributes" over all fibers in the bundle. This seems to follow directly from the local trivialization condition, at least going by the nlab page (since the local trivialization condition implies the fiberwise pullback). Does this mean the vector space structure always fails to "distribute" in a fiberwise manner when this condition is not satisfied?
First, I want to confirm local trivialization is a property and not a structure.
To use the terminology in a very precise way, we should say a local trivialization is a structure, and the existence of such a structure is a property called locally trivializability.
Then, what ordinary topologists call a 'locally trivial' fiber bundle is really a locally trivializable fiber bundle, not a locally trivialized fiber bundle. We haven't chosen local trivializations: we just know that we can.
I'm not saying we should fight against topologists who say 'locally trivial'; I'm just saying that it means 'locally trivializable'.
Similarly, topologists often say 'trivial fiber bundle' when they mean 'trivializable fiber bundle' - one for which a trivialization exists. This is different than a trivialized fiber bundle.
Topologists also talk about a 'local trivialization', and this is a structure: a specific choice of local trivialization, perhaps in a neighborhood of a point or perhaps on every open set of an atlas. (They will say which.)
Hi! I'm back. I reviewed some resources on connections and I think I have some sense of what they are doing, at least for principal bundles. But to be honest, there are too many concepts I'm being bombarded with while learning this field to be covered even by a semester class on differential geometry, let alone via asking questions here, which is quite unfortunate as I'm very curious about it all. So I want to do things bit by bit, and to do this, I wanted to explore gauge theory by "building up" classical (relativistic, but certainly non-quantum) electromagnetism.
Fortunately, there's an extremely streamlined, formulaic, step-by-step procedure for how to build up any gauge theory. I have a vague conceptual sense of what this entails: first we find the continuous symmetries of our system to make into a Lie group (in EM, this is U(1)), then choose a base space for our principal bundle (for relativistic EM, I think we'd need some sort of U(1)-principal bundle over Minkowski space), then discuss what our gauge and gauge symmetries are (phase and phase shifts for EM), then define the appropriate connection, then something about representations (I think this might be needed to define an "associated vector bundle"), then define the gauge field associated to the connection (the EM potential), then the curvature, which is often a tensor (the EM field tensor), then the conserved quantity via Noether's theorem (electric charge), then the PDE that relates the fields and conserved quantities all together (Maxwell's equations), and finally determine the corresponding Lagrangian mechanics of the theory. While I have passing familiarity with these steps, I need to "fill in" all the rigorous mathematical detail, which is what I intend to do by going over all these steps in gauge theory here!
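To fix notation for where I'm headed, here's my rough sketch of how the EM case should come out (standard facts, as I understand them):
```latex
% Electromagnetism as a U(1) gauge theory, schematically:
% A = connection 1-form (the EM potential),
% F = curvature (the field strength tensor), J = the current:
F \;=\; dA, \qquad dF \;=\; 0, \qquad d{\star}F \;=\; {\star}J,
% where the last two equations are Maxwell's equations, and gauge
% transformations act by A -> A + d(lambda), leaving F unchanged.
```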
I think I already understand the first few steps well. First things first, we need to define a "principal bundle" for our gauge theory. A principal bundle, as I understand it, is the device we need to introduce the concept of "symmetry" into our physical system, and is the primary reason why symmetry plays such an important role in physics (in everything from gauge theory to Noether's theorem). It does so by first taking the fundamental object we use to study all symmetries in math, a group, and then applying one to each point in our base space as the fibers of the bundle. More specifically, we make the group a Lie group, a group with a smooth manifold structure, and so our principal bundle is automatically a smooth bundle. As a smooth bundle, we can now define a connection on it, which will enable us to unlock the rest of the gauge theory. This connection is given by a 1-form, although "where" this 1-form lives (which bundle it's a section of) is still confusing me (but more on that later).
Since I think I understand what the principal bundle and Lie groups are doing, I'll skip right to better understanding sections of the principal bundle. My first question is to do with "gauge transformations". Let's say we take a section of the principal bundle (not sure what to call this, since "gauge field" is reserved for something else we will get to later); we can alter this both globally and locally without changing anything physical, since we are only changing our arbitrary local coordinates at each point. So the gauge transformation seems to have something to do with an "action" of the Lie group, and potentially even a choice of action of the Lie group for a number of points in our base space (or for all of them if this is a global gauge transformation). My question is then: what is the specific mathematical relationship between the Lie groups, actions of the Lie groups, gauge transformations both local and global, and transformations of the sections of a principal bundle?
I'm really delighted that you're interested in physics, because it provides a useful counterweight to category theory. So do many other subjects, but physics is probably my favorite.
If you truly know what a principal bundle is, you'll know what the category of principal G-bundles over a manifold B is. Then a gauge transformation is precisely an automorphism of a principal G-bundle P.
John Baez said:
Then a gauge transformation is precisely an automorphism of principal G-bundle P
That's an interesting perspective! Although, I was under the impression that a gauge transformation was a transformation on a section of the principal bundle rather than the whole thing itself. Maybe these two perspectives can be related if one can define a section as a morphism into the principal bundle, in which case a gauge transformation of that section is the composition of that morphism with the automorphism?
Speaking of which, how do you define a section of a principal bundle category-theoretically? For vector bundles, we found that the trivial line bundle produced morphisms corresponding to sections, which ultimately was due to the fact that a single basis vector determines the whole linear map. The 1d vector space then can be said to represent points. In the category Grp, the object that represents points is the integers (I believe), but I'm not so sure this holds in LieGrp since the integers aren't a continuous object. So as a result, I'm not sure which principal bundle is the one that represents sections. Which one does represent sections and why?
John Onstead said:
John Baez said:
Then a gauge transformation is precisely an automorphism of principal G-bundle P
That's an interesting perspective! Although, I was under the impression that a gauge transformation was a transformation on a section of the principal bundle rather than the whole thing itself.
No, definitely not. But of course gauge transformations, being automorphisms of principal bundles, act on sections of principal bundles - and also sections of every bundle 'associated' to the principal bundle, in the technical sense of 'associated bundle'.
Indeed, the main use of a principal G-bundle P→B is to functorially build an associated bundle with fiber F over B from any space F on which G acts. Then gauge transformations automatically act on this associated bundle, and on its sections.
We especially do this for vector bundles, which we get from vector spaces on which acts.
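As a sketch, the associated bundle construction goes like this:
```latex
% Given a principal G-bundle P -> B and a left G-space F, set
P \times_G F \;=\; (P \times F)\,\big/\,\big[\,(p \cdot g,\; f) \sim (p,\; g \cdot f)\,\big],
% a fiber bundle over B with fiber F.  When F is a vector space with a
% linear G-action, P x_G F is a vector bundle, and any automorphism of P
% (i.e. any gauge transformation) acts on it and on its sections.
```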
Speaking of which, how do you define a section of a principal bundle category-theoretically?
I assume we both know what a section of a fiber bundle is.
Any principal bundle has an underlying fiber bundle, and a section of that principal bundle is defined as a section of its underlying fiber bundle.
It's just like vector bundles: any vector bundle has an underlying fiber bundle, and a section of that vector bundle is defined as a section of its underlying fiber bundle.
A mathematician who doesn't use forgetful functors a lot will be crippled. They'll be wasting time wondering stuff like "what's an element of a vector space?" - because a vector space is not a mere set.
John Baez said:
Indeed, the main use of of a principal G-bundle P→B is to functorially build an associated bundle with fiber F over B from any space F on which G acts. Then gauge transformations automatically act on this associated bundle, and on its sections.
I see the virtue in defining a gauge transformation as a principal bundle automorphism then! But how is this supposed to connect with the usual definition of a gauge transformation as specifying an element of the Lie group at each point? If the automorphism acts fiberwise, then it is essentially defining an automorphism on each individual Lie group that comprises each fiber in the bundle. Is an automorphism of a Lie group somehow related to an element of that Lie group? If so, how? And do Lie group actions come into play here, since actions are generally how we interpret an element of a group as a transformation?
Addendum: Wikipedia actually defines a principal bundle in terms of a Lie group action. This is different to the way I learned about principal bundles, which were as bundles such that every fiber was a Lie group. Maybe my difficulty here is in seeing how these two notions are related, if at all.
The automorphism which sends a point x of the base to an element g(x) of the Lie group G over which your bundle is principal acts on the bundle by multiplying the fiber over x by g(x). For instance, consider the torus as an S¹-principal bundle over S¹; the automorphism determined by the function θ ↦ 2θ on S¹ twists the fiber at θ by the angle 2θ, so that it leaves two opposite fibers untouched and the two halfway between those get rotated through π radians, etc.
You can do the same thing even for nontrivial bundles because the fiber is always a torsor for the structure group.
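Concretely, for a trivialized bundle B × G the formula would be (a sketch):
```latex
% A function g : B -> G determines the gauge transformation
\alpha_g(b,\, h) \;=\; \big(b,\; g(b)\, h\big),
% which commutes with the right G-action: h acts on the right while
% g(b) multiplies on the left.  In the torus example G = S^1 is abelian
% and g(theta) = 2*theta, so the fiber over theta is rotated by 2*theta.
```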
John Onstead said:
I see the virtue in defining a gauge transformation as a principal bundle automorphism then! But how is this supposed to connect with the usual definition of a gauge transformation as specifying an element of the Lie group at each point?
That's not the correct definition of gauge transformation except for a trivialized principal bundle. Of course every principal bundle can be locally trivialized, and every principal bundle over a contractible space can be trivialized, so you may have seen physicists act like a gauge transformation is just a smooth function from the base space to the Lie group G. But that's not accurate in general.
If the automorphism acts fiberwise, then it is essentially defining an automorphism on each individual Lie group that comprises each fiber in the bundle.
Which bundle? It's important to note that the fibers of a principal G-bundle are not groups. There is no way to multiply two elements of a fiber, and most especially there is no particular element of the fiber called 'the identity'.
They are, instead, torsors of the chosen Lie group G!
That is, they are nonempty manifolds on which the group G acts in a free and transitive way. Read my entertaining web page:
It turns out that if you have a principal G-bundle P → B, each fiber P_b for b ∈ B is a torsor of G. The automorphism group of this torsor is isomorphic to G, but not canonically, and in many cases this prevents us from identifying gauge transformations with smooth functions B → G. (Not only is there not a canonical identification, there's none at all.)
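Here's a sketch of why the identification is non-canonical:
```latex
% Pick a point p0 in the fiber P_b; this gives a bijection
G \;\xrightarrow{\;\cong\;}\; P_b, \qquad g \mapsto p_0 \cdot g,
% and hence an isomorphism Aut(P_b) = G.  A different basepoint
% p0' = p0 h changes this identification by conjugation by h, so no
% choice of basepoint singles out a canonical isomorphism.
```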
But we can do it if our principal bundle is trivialized, or if G is abelian.
Oh! So I learned principal bundles all wrong then. They aren't group objects in Smooth/B after all. Instead, each fiber is a torsor for the Lie group, but not the Lie group itself. I'm going to need to think this through a little bit. But things are already starting to make more sense!
Right, they are not group objects in Diff/B. They are torsors for a group object in Diff/B.
That group object is the trivial bundle of Lie groups G × B → B.
You'll see this if you read the usual definition of principal bundle on Wikipedia. Here they are working topologically instead of smoothly:
A principal G-bundle, where G denotes any topological group, is a fiber bundle π: P → X together with a continuous right action P × G → P such that G preserves each fiber of P and acts freely and transitively (meaning each fiber is a G-torsor) on them in such a way that for each x ∈ X and y ∈ P_x, the map G → P_x sending g to yg is a homeomorphism.
This says exactly that P is an object in Top/X that's a torsor for the group object given by the trivial bundle of groups X × G → X.
(Note from the Wikipedia quote that we use right actions of G here: there are right torsors and left torsors, but typically 'torsor' means 'right torsor'. This is an arbitrary convention.)
This makes sense but there's still a loose end. If fibers of the principal bundle are torsors and not Lie groups, then why do so many online articles (the ones from which I initially "learned" what a principal bundle was) state that the fibers of a G-principal bundle are diffeomorphic to G by the free and transitive property of the action? How can a torsor- the fiber of the G-principal bundle- be isomorphic to G itself when G is a Lie group and not a torsor?
John Onstead said:
This makes sense but there's still a loose end. If fibers of the principal bundle are torsors and not Lie groups, then why do so many online articles (the ones from which I initially "learned" what a principal bundle was) state that the fibers of a G-principal bundle are diffeomorphic to G by the free and transitive property of the action?
Because it's true.
They didn't say the fibers are groups or that the diffeomorphism is a group isomorphism.
How can a torsor- the fiber of the G-principal bundle- be isomorphic to G itself when G is a Lie group and not a torsor?
Nobody said they were isomorphic as groups!
Given a group object G in a category C with finite products, we can define a torsor T of G. The underlying C-object of T is necessarily isomorphic, as an object of C, to the underlying C-object of G.
If you pick such an isomorphism φ you can use it to transfer the group object structure from G to T and make T into a group object in C. But the resulting group object structure depends on φ!
So, we say a torsor is 'not canonically' a group - or if we're impatient we say it's not a group, dammit. :upside_down:
All these nuances matter hugely in physics, btw: this is not just pedantic piddling around.
I guess it makes sense. The diffeomorphism exists between the underlying manifold of a principal bundle fiber and the underlying manifold of the Lie group, but not between the Lie group and fiber directly. I probably got led astray by conflating the two. Maybe all mathematicians should take a page from category theory and actually distinguish between some object X and its underlying object U(X) rather than just using the same symbol to describe both. It's so confusing!
The diffeomorphism exists between the underlying manifold of a principal bundle fiber and the underlying manifold of the Lie group, but not between the Lie group and fiber directly.
Okay, sure. This sentence threw me completely for a while, and I wrote some wrong stuff which I have now deleted. Sorry. This sentence is just not something mathematicians would say.
Any red-blooded mathematician would say
"diffeomorphism between the principal bundle fiber and the Lie group"
and mean
"diffeomorphism between the principal bundle fiber and the underlying manifold of the Lie group"
because.... what else could it mean? The word "diffeomorphism" tells us we're working in Diff now, not LieGp. So we must be looking at the underlying manifold of the Lie group.
It's like how we say π+2 and not "π plus the underlying real number of the integer 2". We do the necessary type coercion to make things make sense! Only when there's a possibility of two different things we might consistently mean, do we specify which one we mean.
John Baez said:
It's like how we say π+2 and not "π plus the underlying real number of the integer 2". We do the necessary type coercion to make things make sense! Only when there's a possibility of two different things we might consistently mean, do we specify which one we mean.
I think I understand this analogy, I guess it's a matter of adapting my mindset!
I reviewed torsors, and I especially liked the "entertaining web page". I wish I had stumbled on that before learning about principal bundles! From my understanding, it seems that a torsor acts like a set of quantities that don't take on meaning unless they are relative to some other quantity or baseline. It is a group action on a set S such that for any two elements x and y in S, there is a unique group element that sends x to y. This means that, if we fix x, there is a unique correspondence between group elements and all the other elements of S, where we associate an S element y to the group element that acts on x to produce y. This creates a bijection between elements of the group (the underlying set of the group) and S itself, one for each element of S that we can fix. These are, I believe, called "trivializations" and define a "baseline" we can view the other elements relative to (which explains how we can use a specific trivialization equipped onto a torsor to define a group).
This makes me think it might be possible to view a (local) section of a principal bundle as an assignment of local coordinate systems to points in the base space, since a choice of an element in each torsor fiber corresponds to a choice of trivialization for each torsor fiber. But even if this is true, I'm still a little confused when it comes to what started this discussion in the first place with the automorphisms. Above you mentioned "the automorphism group of this torsor is isomorphic to G, but not canonically". But from my reading, the thing that's isomorphic but not canonical isn't some automorphism, but rather the trivialization isomorphisms from the underlying set of G to S. Is there something I'm misunderstanding?
From my understanding, it seems that a torsor acts like a set of quantities that don't take on meaning unless they are relative to some other quantity or baseline. It is a group action on a set S such that for any two elements x and y in S, there is a unique group element that sends x to y. This means that, if we fix x, there is a unique correspondence between group elements and all the other elements of S, where we associate an S element y to the group element that acts on x to produce y. This creates a bijection between elements of the group (the underlying set of the group) and S itself, one for each element of S that we can fix.
Exactly!
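If you want to play with this concretely, here's a tiny toy example in Python (my own made-up code, nothing canonical about it): a finite group acting freely and transitively on a finite set, with one trivialization bijection per choice of basepoint.

```python
# Take G = Z/4 acting on a 4-element set S freely and transitively, and
# list the trivialization bijections G -> S, g |-> x . g, one for each
# choice of basepoint x in S.
G = range(4)                       # Z/4 with addition mod 4
S = ['a', 'b', 'c', 'd']

def act(x, g):                     # right action: shift the letter by g
    return S[(S.index(x) + g) % 4]

for x in S:                        # one trivialization per basepoint
    print(x, '->', {g: act(x, g) for g in G})
# Each line is a different bijection G -> S: none of them is canonical.
```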
This makes me think it might be possible to view a (local) section of a principal bundle as an assignment of local coordinate systems to points in the base space, since a choice of an element in each torsor fiber corresponds to a choice of trivialization for each torsor fiber.
I don't think "local coordinate system" is what you get, but there's something to your intuition. Let's do an example:
In the special case of general relativity the relevant principal bundle F(M) over the spacetime M has fiber F_x(M) at x ∈ M whose elements are orthonormal bases of the tangent space T_xM.
F(M) is a principal O(3,1)-bundle where O(3,1) is the Lorentz group. And the points of the fiber F_x(M), which are orthonormal bases, are called frames. A frame determines a coordinate system on the tangent space T_xM, or if you prefer, a way of identifying T_xM with Minkowski spacetime.
So this coordinate system is not "local": you might call it "microlocal" or "infinitesimal", since it's only a coordinate system on the tangent space at the point x.
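Here's a toy computation illustrating that last point (my own illustrative code; I use a Euclidean orthonormal frame of R² rather than a Lorentz frame, just to keep the linear algebra simple):

```python
import numpy as np

# A frame at a point is a basis of the tangent space there; it lets us
# assign coordinates to tangent vectors at that point (and only there).
theta = 0.3
frame = np.column_stack([[np.cos(theta), np.sin(theta)],
                         [-np.sin(theta), np.cos(theta)]])  # two basis vectors

v = np.array([1.0, 2.0])                 # a tangent vector at our point
coords = np.linalg.solve(frame, v)       # its coordinates in this frame
print(coords)                            # a different frame gives different coords
```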
Above you mentioned "the automorphism group of this torsor is isomorphic to G, but not canonically". But from my reading, the thing that's isomorphic but not canonical isn't some automorphism, but rather are the trivialization isomorphisms from the underlying set of G to S. Is there something I'm misunderstanding?
Don't say "the" thing that's isomorphic but not canonical: there's more than one! It's a basic fact about -torsors that given a -torsor , it's group of automorphisms as a -torsor is isomorphic to , but not canonically. Proving this might be a good exercise.
As a result:
Given a principal G-bundle P → B, each fiber P_b is a G-torsor, and thus diffeomorphic to G, but not canonically.
Given a principal G-bundle, each fiber P_b is a G-torsor, and thus its group of automorphisms as a G-torsor, say Aut(P_b), is isomorphic as a Lie group to G, but not canonically.
These groups Aut(P_b) are the fibers of a bundle of Lie groups over B. A section of this bundle is the same as a gauge transformation of P!
It's easy to confuse P and this bundle of Lie groups, since they both have fibers that look a lot like G, but they are conceptually different and not isomorphic in any sense at all.
I think before delving into this stuff it's really crucial to show that given a G-torsor T, its group of automorphisms as a G-torsor is isomorphic to G, but not canonically. And a good warmup for this problem is to do the case where G is regarded as a torsor over itself. Even here the exercise is nontrivial.
John Baez said:
I think before delving into this stuff it's really crucial to show that given a G-torsor T, its group of automorphisms as a G-torsor is isomorphic to G, but not canonically.
Hmm, I'll give it a shot, but this attempt might be handwavy and non-rigorous (maybe I'll try to do it more rigorously at a later point in time). Every automorphism on a G-torsor corresponds to an automorphism on the underlying set S as well, so we have to figure out which ones do, since not every automorphism on S will preserve the torsor structure. Being extremely handwavy, it seems the only ones that will are those that act as a "change of basis" on the torsor if we had specified some trivialization bijection already. So basically, let's say we have fixed an element x in S and we have the corresponding trivialization bijection from G to S. An automorphism on S should have the effect that, when composed with this bijection, it gives you another trivialization bijection from G to S. So let's say the automorphism maps x to y. Then, it's like we are changing base from seeing x as being like the "identity" to now having y play that special role. Since there's a unique such trivialization bijection for each element of S, this means that there's only one automorphism on S mapping x to each other element that abides by this rule, thus putting the elements of the automorphism group and the elements of S in bijection with one another.
Now it's time to address the group structure and why a group isomorphism from G to Aut(T) for some G-torsor T might not be canonical. The group structure on automorphisms is given by composition of automorphisms, with the identity element given by the identity morphism. Let's say we fix an element x in S; then the automorphism sending x to y will act like "multiplication by y", since x is acting as the "identity" in this role, and the thing you multiply the identity by to get some y is y itself. But if I instead fixed y, then the automorphism sending x to y cannot act as "multiplication by y", since y in this new point of view is acting as the "identity", so "multiplication by y" is now given by a different automorphism: the identity morphism. As such, depending on which element you fix, there are subtle differences in how to interpret what the automorphisms are doing, and thus how to interpret the elements of the torsor's automorphism group. Again, since these differences correspond to each way you fix an element of S, each one gives rise to its own isomorphism from G to the automorphism group of the G-torsor.
Really hope I didn't write a whole bunch of gibberish but I really did want to give this one a try to get some practice!
I feel like I learned a lot yesterday in clearing up my misconceptions about principal bundles and getting a better understanding of them and transformations on them. I want to continue this momentum by moving into the next topic, which is connections on bundles. As mentioned, I found this topic quite difficult to understand, and so I wanted to go through what is going on in detail. So first, there's the most general notion of a connection, which is an Ehresmann connection. These might be worth getting into later, but for now they are too abstract (which might be surprising coming from me, but I'm on a mission here to understand gauge theory), and anyway one can show a one-to-one correspondence between Ehresmann connections and principal connections, as well as vector bundle connections, when we add that extra structure onto the bundle. Principal connections and vector bundle connections are given by a notion of a 1-form and a covariant derivative respectively.
Here's what I know so far. First off, there's a very close relation between the notion of a connection and paths. I think this comes from the inherent path-dependence of parallel transport that comes from curvature (IE, the classic example is a sphere where you parallel transport a vector from the pole to the equator, then along the equator, and then back up to the pole, only to find it perpendicular to the starting vector). John Baez linked above to a categorical approach to principal connections for G-principal bundles, which defines them as functors from a path groupoid of the base space to G viewed as a one-object category; such a functor assigns to each path (more precisely, each thin homotopy class of paths) on a manifold an element of G representing its holonomy. The 1-form definition of a principal connection also involves paths, as stated in the article: a 1-form is a natural object to integrate along a path, and the holonomy can be calculated from integrating the connection 1-form along the path (plus a few other steps which I'm still trying to understand). Also these two notions of holonomy converge somehow (making the holonomy group a "connected Lie subgroup" of G). There's also a notion of holonomy for vector bundle connections (covariant derivatives) which allows you to actually find the function on a tangent space at a point that sends a vector to its parallel transported form given a loop starting and ending at that point (so for instance this map will take in a vector at the pole of a sphere, and given the path described above, will return a vector perpendicular to it, kind of like a rotation by 90 degrees).
A lot to unpack in the above. So far, it seems that the connection is some sort of object we use to describe the "infinitesimal" effects of curvature on a space, and holonomy is sort of the "net" effect of this curvature along a path when you "add together" these infinitesimal contributions. My first question is: is this an accurate assessment? And if so, then exactly how do you find the holonomy of a path in a vector bundle connection via the covariant derivative?
John Onstead said:
John Baez said:
I think before delving into this stuff it's really crucial to show that given a G-torsor T, its group of automorphisms as a G-torsor is isomorphic to G, but not canonically.
Hmm, I'll give it a shot, but this attempt might be handwavy and non-rigorous (maybe I'll try to do it more rigorously at a later point in time). Every automorphism on a G-torsor corresponds to an automorphism on the underlying set S as well, so we have to figure out which ones do, since not every automorphism on S will preserve the torsor structure. Being extremely handwavy, it seems the only ones that will are those that act as a "change of basis" on the torsor if we had specified some trivialization bijection already. So basically, let's say we have fixed an element x in S and we have the corresponding trivialization bijection from G to S. An automorphism on S should have the effect that, when composed with this bijection, it gives you another trivialization bijection from G to S. So let's say the automorphism maps x to y. Then, it's like we are changing base from seeing x as being like the "identity" to now having y play that special role. Since there's a unique such trivialization bijection for each element of S, this means that there's only one automorphism on S mapping x to each other element that abides by this rule, thus putting the elements of the automorphism group and the elements of S in bijection with one another.
Now it's time to address the group structure and why a group isomorphism from G to Aut(T) for some G-torsor T might not be canonical. The group structure on automorphisms is given by composition of automorphisms, with the identity element given by the identity morphism. Let's say we fix an element x in S; then the automorphism sending x to y will act like "multiplication by y", since x is acting as the "identity" in this role, and the thing you multiply the identity by to get some y is y itself. But if I instead fixed y, then the automorphism sending x to y cannot act as "multiplication by y", since y in this new point of view is acting as the "identity", so "multiplication by y" is now given by a different automorphism: the identity morphism. As such, depending on which element you fix, there are subtle differences in how to interpret what the automorphisms are doing, and thus how to interpret the elements of the torsor's automorphism group. Again, since these differences correspond to each way you fix an element of S, each one gives rise to its own isomorphism from G to the automorphism group of the G-torsor.
That's quite nice! If we nail it down a bit further we'll get some new insights.
Let's say a G-torsor is a nonempty set equipped with a free and transitive right action of the group G. (The choice of right is arbitrary but standard here.)
What's the automorphism group of this torsor?
Well, if you believe me that every torsor is isomorphic to G acting on itself with right multiplication, we can settle this question when our torsor is G itself, and we'll know the answer in general.
(Again, if I were John Onstead I might say "every torsor is isomorphic to the torsor whose underlying set is the underlying set of G, with G acting on this set by right multiplication"... or something like that. Sprinkle on underlying functors as needed to make what I say parse: I talk like an ordinary mathematician.)
So let's suppose we have this torsor: the group G acting on itself by right multiplication. What are the automorphisms of this torsor? I.e.:
Puzzle. Given a group G, which functions

f: G → G

obey

f(gh) = f(g)h

for all g, h ∈ G?
John Baez said:
So let's suppose we have this torsor: the group G acting on itself by right multiplication. What are the automorphisms of this torsor? I.e.:
Puzzle. Given a group G, which functions f: G → G
obey
f(gh)=f(g)h
for all g,h∈G?
Well if h was the inverse element of g, g^-1, then we get an equation f(g g^-1) = f(g) g^-1. This becomes f(id) = f(g) g^-1 because the composition of g with its inverse is the identity element. This means the action of the function on any element g is determined by f(g) = f(id) g. I'm not sure if this helps in solving the puzzle! But also above I demonstrated for a general G-torsor (maybe a left one though) that you can put actions/multiplications and elements of S into bijection, and that you can put automorphisms and elements of S in bijection, thus putting automorphisms and actions into bijection, all so long as you make sure to specify some starting point in your torsor!
I'm not sure if this helps in solving the puzzle!
It helps a huge amount! So solve it: tell me exactly which functions f: G → G obey

f(gh) = f(g)h

for all g, h ∈ G.
But also above I demonstrated for a general G-torsor (maybe a left one though) that you can put actions/multiplications and elements of S into bijection,
That's true for both left and right torsors, so please humor me and work with right ones now, since my equation was for right torsors.
and that you can put automorphisms and elements of S in bijection, thus putting automorphisms and actions into bijection, all so long as you make sure to specify some starting point in your torsor!
I think this is a second proof that there is a bijection between the set of functions f obeying the equation above and the group G. But what I'm asking is... exactly which functions obey the equation above? There must be one such function for each element of G, good - but what is it?
There's a simple formula for it, which you can extract from what you've already said.
John Baez said:
There's a simple formula for it, which you can extract from what you've already said.
I think f(g) = f(id) g is about as far as I can go unfortunately! It's telling you that for every f that is an automorphism on the torsor, the action of the function f will correspond to multiplication by f(id). That's about as close to a formula as you can get, no? Then conversely the functions f that are automorphisms are the ones defined by some group multiplication.
Great! Every map f of right G-torsors from G to itself has

f(g) = f(1)g

so f is given by left multiplication by some element of G, namely f(1). And you can check that for any a ∈ G, the map

L_a: g ↦ ag

really is a map of right G-torsors:

L_a(gh) = a(gh) = (ag)h = L_a(g)h.

You can check that L_a is not just a map but an automorphism of right G-torsors, and that a is uniquely determined by L_a (since it's L_a(1), as you said).
Indeed there's an isomorphism

G ≅ Aut(G as a right G-torsor)

given by

a ↦ L_a.

In fact, taking the underlying right G-torsor of G is a great way to construct a thing whose automorphism group is G!
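If you like brute force, here's a little Python check of this (toy code of my own): for G = S₃ it enumerates all functions f: G → G, keeps the ones obeying f(gh) = f(g)h, and confirms they are exactly the six left multiplications.

```python
from itertools import permutations, product

# G = S_3, elements as permutation tuples, with composition as the product.
G = list(permutations(range(3)))

def mul(p, q):                              # (p*q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

equivariant = []
for values in product(G, repeat=len(G)):    # every function f: G -> G
    f = dict(zip(G, values))
    if all(f[mul(g, h)] == mul(f[g], h) for g in G for h in G):
        equivariant.append(f)

left_mults = [{g: mul(a, g) for g in G} for a in G]
print(len(equivariant))                             # 6 = |G|
print(all(f in left_mults for f in equivariant))    # True
```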
Ultimately this is why principal G-bundles are important: they are bundles of things whose automorphism group is G.
I haven't done that many mathematical exercises recently, so this was good practice! It makes things clearer too, when understanding what we are doing with principal bundles.
Though with that out of the way, maybe we can move on to connections now? :)
Sure, I just wanted to get to the real point of torsors before leaving them.
In a way it's a pity you don't want to talk about Ehresmann connections - that's the name for a connection on a principal bundle, right? That would let us leverage all this stuff about principal bundles and torsors. But I agree that it's one of the more confusing approaches at first: especially if one is used to physics, connections on vector bundles are easier to appreciate. That's the only kind of connection I talk about in Gauge Fields, Knots, and Gravity: connections on vector bundles.
John Baez said:
In a way it's a pity you don't want to talk about Ehresmann connections - that's the name for a connection on a principal bundle, right?
No, I still very much want to talk about principal connections! At least according to Wikipedia, an Ehresmann connection is a separate and more general concept than a principal connection, since it applies to any smooth fiber bundle regardless of the structure on it. But I wanted to get into principal connections later, which is why my first question is on vector bundle connections. Basically, all I wanted to know was how, in the context of vector bundle connections, the covariant derivative, the holonomy along a closed loop starting and ending at a point x, and the parallel transport map TxM -> TxM for that path all interrelate. Preferably, would there be some explicit formula that has all these concepts in it with an equals sign, so I can explicitly see the relationship?
Parallel transport along a path is the solution of an ODE involving covariant differentiation. If the path is a loop, parallel transport all around the loop is called 'holonomy'.
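If a concrete example helps, here's a numerical sketch (my own toy code, using the extrinsic form of the parallel transport equation dv/dt = -(v · x'(t)) x(t) on the unit sphere) that transports a vector around the classic octant loop and recovers the expected rotation through roughly π/2:

```python
import numpy as np

# Parallel transport on the unit sphere: Euler steps of the ODE
# dv/dt = -(v . x'(t)) x(t), which keeps v tangent with zero covariant
# derivative along the curve x(t).
def transport(points, v):
    for a, b in zip(points[:-1], points[1:]):
        dx = b - a
        v = v - np.dot(v, dx) * a        # Euler step of the transport ODE
        v = v - np.dot(v, b) * b         # re-project tangent to the sphere
    return v

# The classic loop: north pole -> equator, a quarter of the equator,
# then back up to the pole.
t = np.linspace(0, np.pi / 2, 2001)
leg1 = np.stack([np.sin(t), 0 * t, np.cos(t)], axis=1)              # pole -> (1,0,0)
leg2 = np.stack([np.cos(t), np.sin(t), 0 * t], axis=1)              # along equator
leg3 = np.stack([0 * t, np.sin(t)[::-1], np.cos(t)[::-1]], axis=1)  # back to pole

loop = np.concatenate([leg1, leg2, leg3])
v0 = np.array([1.0, 0.0, 0.0])           # a tangent vector at the north pole
v1 = transport(loop, v0)
angle = np.arccos(np.clip(np.dot(v0, v1), -1, 1))
print(v1, angle)                          # rotated through ~pi/2, as promised
```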
John Baez said:
Parallel transport along a path is the solution of an ODE involving covariant differentiation. If the path is a loop, parallel transport all around the loop is called 'holonomy'.
Ok, that's actually pretty straightforward.
Now it's time to get into the principal connections, and for my questions about this I want to understand them on a conceptual level before technical details. While vector bundle connections make intuitive sense (they correspond to parallel transport), I'm still trying to conceptually understand what exactly principal connections are doing, especially considering that the fibers of a principal bundle are torsors. What is the connection doing with the torsors exactly- there are no vectors (at least not yet) for it to be parallel transporting.
The second thing I want to understand on a conceptual level is the formula given in your paper for holonomy of a principal bundle. This is a path ordered exponential P exp(int_y A). A, as a 1-form, is meant to be integrated along a path, which is exactly what int_y A is doing. But then why do the path ordered exponential on top of that? Basically I guess I want to know, conceptually, what the integral is doing: what it is "adding up" (I guess this would tie into the first question above) and what quantity is the end result of the integral before we do the exponential. Then I want to understand what the exponential is doing to that quantity and why it's necessary. Lastly, I'm curious about how this aligns with my previous thought that connections represent some sort of infinitesimal effect of curvature while holonomy is a way to take these infinitesimal details at each point along a path and get the net result of the influence of curvature on things all along that path.
John Onstead said:
Now it's time to get into the principal connections, and for my questions about this I want to understand them on a conceptual level before technical details.
Okay. By the way, nobody ever says "principal connections", for some reason. They always say "connections on principal bundles" or something.
While vector bundle connections make intuitive sense (they correspond to parallel transport), I'm still trying to conceptually understand what exactly principal connections are doing, especially considering that the fibers of a principal bundle are torsors.
Here's the idea. Suppose we have a principal G-bundle P → M with a connection on it. Ultimately, given a smooth path γ in M from x to y, we want our connection to give us an isomorphism of G-torsors

P_x → P_y.

This isomorphism is called parallel transport along γ, but a lot of us modern folks also call it the holonomy along γ and write it as

hol(γ): P_x → P_y
because we say "holonomy: it's not just for loops anymore!" (Like that ad campaign: "milk: it's not just for breakfast anymore!")
But for a connection we want to define this at the 'infinitesimal' level: i.e. we imagine y is very close to x, and we imagine we're moving from x to y in the direction of some tangent vector v. So we need to understand what the "infinitesimal" version of an isomorphism of torsors should be!
If we were doing synthetic differential geometry this would be a snap - after you've spent ten years learning synthetic differential geometry. :upside_down:
But since we're hidebound traditionalists we'll do it using differential forms, vector bundles and stuff like that - after we've spent ten years learning that stuff.
I've got to quit for now, but I hope this at least provides food for thought.
In the meantime, please visualize a manifold with a bunch of G-torsors sitting over it - especially easy when G = S¹ or G = ℝ - and imagine what isomorphisms between these G-torsors look like! If we were talking in person I would have already drawn pictures to illustrate this.
I've thought about it a bit and now I think I have some answers! I know that for two isomorphic objects, every isomorphism between them is in correspondence with automorphisms on one of them. So an isomorphism from one fiber to another looks like applying some automorphism on your starting fiber. And we know already from above what an automorphism on a torsor is- given we've specified a starting point, it's a way to "change base" by essentially acting as a map taking one coordinate system to another. So if G = S1, then an isomorphism between fibers would correspond to rotations of a circle without a fixed basis or zero point, and if G = R, then an isomorphism between fibers would correspond to translations of a line, again without a fixed basis or zero point. So the curvature of a principal bundle seems to mean you need to have a change in coordinates when you are comparing two points along a path through this curvature.
But now here's my question. Holonomies apparently correspond to group elements of the group G. But remember, automorphisms on (and thus isomorphisms between) torsor fibers are not canonically in bijection with elements of the group G. We need to fix a point in the torsor first before we can figure out which group element will transform our fiber in a certain way. This seems to create a massive contradiction: how can a holonomy be both a group element- not canonically bijective with isomorphisms between torsor fibers- and an isomorphism between torsor fibers at the same time!?
John Onstead said:
But now here's my question. Holonomies apparently correspond to group elements of the group G.
They "correspond", but not in any canonical way... unless put some extra structure on the situation. For example, physicists often choose a trivialization of their principal G-bundle (which can always be done locally, and can be done globally when the base space is , as it is in special relativity). This amounts to choosing an isomorphism of each fiber with G. Using that, they identify holonomies with elements of G.
But remember, automorphisms on (and thus isomorphisms between) torsor fibers are not canonically in bijection with elements of the group G. We need to fix a point in the torsor first before we can figure out which group element will transform our fiber in a certain way. This seems to create a massive contradiction: how can a holonomy be both a group element- not canonically bijective with isomorphisms between torsor fibers- and an isomorphism between torsor fibers at the same time!?
I tried to answer that above. Often when you encounter 'contradictions' of this sort it's because people have equipped the situation with extra structure that lets them make otherwise noncanonical choices.
It's like what we did above: when we specified a local section of the principal bundle, it was like choosing a trivialization for each fiber. We could then define a bundle where the fibers were the Lie group itself. This seems to be the same kind of thing!
Then I guess my next question is what exactly is going on in the calculation P exp(int_y A)? A is a g-valued 1-form where g is the Lie algebra associated to G, thus taking an exponential will upgrade it into an element of G. But as you mentioned above, this element only makes sense in the context of a choice of trivialization. Does that mean the connection itself only makes sense in the context of a choice of trivialization? I'm also a little confused about what this formula is doing exactly... it seems to be taking infinitesimal things and making them not infinitesimal (like we would expect; as you mentioned above, a connection is supposed to be like an infinitesimal holonomy), but it seems to be doing this in two ways at the same time. The integral adds up infinitesimal things to make them finite, and the exponential takes infinitesimal transformations about the origin of a Lie group and makes them finite (IE, actual elements of the Lie group). So wouldn't doing both be a little redundant?
Then I guess my next question is what exactly is going on in the calculation P exp(int_y A)? A is a g-valued 1-form where g is the Lie algebra associated to G, thus taking an exponential will upgrade it into an element of G.
Note that you don't first integrate A, then exponentiate it, and then write the letter P in front to make it look cooler. :upside_down:
The 'path ordered exponential' is more subtle than that: see Wikipedia, where some jerk has replaced the usual letter P with the letters OE.
But as you mentioned above, this element only makes sense in the context of a choice of trivialization. Does that mean the connection itself only makes sense in the context of a choice of trivialization?
No, that would render it utterly useless! Please have a little faith that gauge theory, when done right, does not depend on a trivialization. However, intermediate steps of calculations make use of local trivializations. If they didn't, we wouldn't require that fiber bundles be locally trivializable! We need to use that assumption all over the place.
So here's one standard way to define holonomy.
You can take your path γ, chop it into short pieces γ_i each of which lies in a contractible open subset of the base, trivialize the bundle over each open subset, and use the formula given on Wikipedia to define the holonomy along each short segment γ_i, which is an element

g_i ∈ G

where x_{i-1} and x_i are the starting point and ending point of that short segment, and then compose these g_i's to get the holonomy of the whole path, γ. Then check that nothing depended on your choice of local trivializations!
All this is explained better in my book Gauge Fields, Knots, and Gravity, and I would not be at all offended if you do what everyone else does, which is to download a free copy from LibGen. In fact it would reduce the amount of repetitive strain injury I get from typing this stuff!
John Baez said:
Note that you don't first integrate A, then exponentiate it, and then write the letter P in front to make it look cooler. :upside_down:
The 'path ordered exponential' is more subtle than that: see Wikipedia, where some jerk has replaced the usual letter P with the letters OE.
That explains a lot, though the notation is somewhat confusing! For instance, you do see expressions of the form exp(int f dx) where you are supposed to take this as a straightforward "do integral -> exponentiate", such as in Feynman's path integral formulation where f is the Lagrangian to calculate the action, and the exponential gives you the amplitude contribution of the path.
While some of the formulas in the article for the path ordered exponential look really complicated, the one that seems to be simplest is the one in terms of a product integral. There, instead of "do integral -> exponentiate", we instead are doing something similar to integration as a whole, but where we replace the summing with product-ing. The exponential is inside the product integral, which now makes more sense- if we interpret the function to be the connection, we are essentially taking a tiny step, getting a Lie algebra element, exponentiating that to get a group element, and then taking another tiny step, repeating the process, and multiplying the two results together, again and again for each tiny step. So you really are accumulating a whole bunch of tiny group elements over the path to create one big group element that represents the overall holonomy!
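To test this picture for myself, here's a toy numerical experiment (my own code, nothing official): it approximates the path-ordered exponential by multiplying lots of small matrix exponentials, and compares that with the naive "integrate, then exponentiate" recipe, for a matrix-valued A(t) whose values at different times don't commute.

```python
import numpy as np
from scipy.linalg import expm

# A gl(2)-valued "connection along the path": its values at different
# times don't commute, so ordering matters.
X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

def A(t):
    return np.cos(t) * X + np.sin(t) * Y

ts, dt = np.linspace(0, 2.0, 4001, retstep=True)

ordered = np.eye(2)
for t in ts[:-1]:
    ordered = expm(A(t) * dt) @ ordered     # one common ordering convention:
                                            # later steps multiply on the left

naive = expm(sum(A(t) * dt for t in ts[:-1]))  # exp of the plain integral
print(ordered)
print(naive)                                    # not the same matrix!
```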
John Baez said:
All this is explained better in my book Gauge Fields, Knots, and Gravity, and I would not be at all offended if you do what everyone else does, which is to download a free copy from LibGen. In fact it would reduce the amount of repetitive strain injury I get from typing this stuff!
It was not my intention to give you RSI! I do apologize; I'm not aware of what is and isn't covered by the book, but I'll take a look into it so I can figure that out. It's just that sometimes I have very specific questions and it's hard to find direct answers to them in a book without doing a lot of jumping around and putting the pieces together, which I might not always have time to do!
We are essentially taking a tiny step, getting a Lie algebra element, exponentiating that to get a group element, and then taking another tiny step, repeating the process, and multiplying the two results together, again and again for each tiny step. So you really are accumulating a whole bunch of tiny group elements over the path to create one big group element that represents the overall holonomy!
Exactly! That's exactly what is going on - at least when we have a connection for a trivialized bundle, which gives a way to treat holonomies as group elements. In general we need to think of them as maps between different fibers of a principal bundle. But there's probably no way to understand that more sophisticated case without first understanding the case of a trivialized bundle, as you're doing now. (This is an example of how people need to understand groups before they can understand groupoids.)
[....] the notation is somewhat confusing! For instance, you do see expressions of the form exp(int f dx) where you are supposed to take this as a straightforward "do integral -> exponentiate", such as in Feynman's path integral formulation where f is the Lagrangian to calculate the action, and the exponential gives you the amplitude contribution of the path.
This is a special case of what we're talking about now - namely holonomies. This is the special case where 1) the principal G-bundle has been trivialized, and also 2) the Lie group is abelian, namely the group U(1) of unit complex numbers.
Because the Lie group U(1) is abelian, when we "accumulate a whole bunch of tiny group elements over the path to create one big group element that represents the overall holonomy" it doesn't matter in what order we accumulate them. This causes a drastic simplification: we can just do an ordinary integral and then exponentiate it!
In fact, the Lie algebra of U(1) is just ℝ with vanishing Lie bracket. So when physicists write something like

exp(i ∫ L dt)

where L is some real-valued function called the 'Lagrangian', they are computing the holonomy of a connection on a trivialized principal U(1)-bundle! And the smarter ones all know this.
We could write this integral as path-ordered:

P exp(i ∫ L dt)

but in this particular case, because the Lie algebra is abelian, the integral makes sense even without the path ordering, and the path ordering does not change the result!
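Here's the same numerical experiment in the abelian case (again just my own toy code): for u(1)-valued integrands the ordered product of tiny phases agrees with the exponential of the ordinary integral.

```python
import numpy as np

# Phases commute, so "accumulate tiny group elements" collapses to
# "exponentiate the ordinary integral".
L = lambda t: np.cos(3 * t) + 0.5          # some real-valued Lagrangian
ts, dt = np.linspace(0, 2.0, 4001, retstep=True)

ordered = np.prod([np.exp(1j * L(t) * dt) for t in ts[:-1]])
naive = np.exp(1j * sum(L(t) * dt for t in ts[:-1]))
print(ordered, naive)                      # equal up to rounding
```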
It's just that sometimes I have very specific questions and it's hard to find direct answers to them in a book without doing a lot of jumping around and putting the pieces together, which I might not always have time to do!
Then you're lucky I have so much time myself - in fact, I've been given a lifetime supply, which will run out only when I die. :upside_down:
I would like to say, thanks so much for your help so far. The above explanation was really helpful and it certainly left me with things to ponder, like what it means- in a mathematical and philosophical sense- that Feynman's path integral can be thought of in terms of holonomies in this way. In the meantime, I do have some time this weekend so I'll take it to look through the book and I'll come back later with questions, perhaps including any I have from the reading!
Great! It doesn't say anything about principal bundles, but it says a lot about vector bundles, gauge transformations and connections for those, how to define holonomies for those, etc.
Let me say something that's hard to find explained well - I prefer to explain things that aren't in my book.
U(1) connections play at least 3 roles in physics:
First, every quantum system has a U(1) symmetry, called 'phase symmetry', because you can multiply any vector in a Hilbert space by a unit complex number, and this operation commutes with all linear operators. In the path-integral approach to quantum mechanics, as a quantum particle evolves in time, its state vector gets multiplied by an element of U(1) which is the holonomy of a U(1) connection.
Second, electromagnetism is described by a U(1) connection, so any charged particle moving through spacetime gets acted on by a U(1) holonomy, in a manner that depends on its electric charge (which determines a representation of U(1)).
It's important to realize that this is a conceptually different connection from the first one! For example an electrically neutral particle is unaffected by the second sort of transformation but not the first.
Third, the U(1) group in the Standard Model is not associated to electric charge: it's associated to some other kind of charge, called hypercharge. Electric charge is associated to some other, nonobvious U(1) subgroup of SU(2) × U(1), which is not in the center of this group. There's a famous formula relating electric charge to hypercharge and weak isospin, the Gell-Mann-Nishijima formula.
That's really interesting! Generally, whenever I think of U(1) symmetry, I think electromagnetism, but it appears to reach beyond that. I'm also really surprised to learn that the U(1) of traditional QED and that of the Standard Model are different from one another- totally didn't expect that (and none of the videos I've watched about the Standard Model mentioned that either, though I guess they have to gloss over some details)! Maybe it's to do with the Higgs spontaneous symmetry breaking messing everything up? In any case the Higgs is something I'd eventually like to get to, but only after I have a good grasp on the basics of pure gauge theory.
I've been going through your book, but I still have a bit to go. In the meantime, I'd like to know how just one group, U(1), and the singular principal bundle corresponding to it (for some given manifold) give rise to so many different manifestations across physics. I suspect it might have to do with choosing different representations and associated bundles for each potential application. Is this true? Also, any interesting facts you'd like to share about representations of bundles and associated vector bundles you might not have covered in your book (IE, things from a categorical POV)?
I'm also really surprised to learn that the U(1) of traditional QED and that of the Standard Model are different from one another- totally didn't expect that (and none of the videos I've watched about the Standard Model mentioned that either, though I guess they have to gloss over some details)! Maybe it's to do with the Higgs spontaneous symmetry breaking messing everything up?
The Wikipedia article Higgs mechanism explains it.
The Higgs field takes values in ℂ². When it falls to its ground state, its expected value determines a unit vector v ∈ ℂ². The electroweak symmetry group SU(2) × U(1)_Y acts on ℂ², where I'm using the term U(1)_Y because this copy of U(1) is associated not to electric charge but to hypercharge, which is denoted Y.
The subgroup of SU(2) × U(1)_Y that preserves the vector v is also isomorphic to U(1), and this is the copy of U(1) that describes the electromagnetic field. Let's call it U(1)_EM.
The Wikipedia article gives formulas at the Lie algebra level describing how U(1)_EM sits inside SU(2) × U(1)_Y.
It's nice to visualize this at the group level: SU(2) × U(1)_Y is the product of a 3-sphere and a circle, and U(1)_EM is a circle in this product that goes once around an arbitrary great circle in the 3-sphere while it's also going once around the circle U(1)_Y.
(The Wikipedia article makes this very unclear.)
The choice of the great circle is determined by the unit vector v: it's an arbitrary choice due to spontaneous symmetry breaking.
I'm working through the book and I'm on the curvature section. I might have missed the page that explains it, but I'm extremely confused about the precise bundle the curvature tensor is a section of. For instance, given a U(1)-principal bundle and a representation (for instance, a spin-1/2 spinor representation), I want to identify the electromagnetic tensor Fuv as a section of some bundle. It seems to be a g-valued 2-form, which would imply that it is a section of the vector bundle (g * M) x ext T*M that is the exterior power of the bundle (g * M) x T*M that the connection 1-form itself is a section of. However, this seems to imply that the curvature exists independently of a choice of representation, but this doesn't seem right to me! The book seems to define the curvature in terms of an "exterior covariant derivative", which makes me believe the representation might matter, since this derivative requires a vector bundle (which would be the associated bundle). Can you clarify what is going on here?
The book doesn't talk about principal bundles, so it sounds like you're trying to guess some stuff based on the book?
On page 214 it defines what it means for a vector bundle to be a G-bundle for a Lie group G. This means that the fiber V is a representation of G and some other stuff holds too. So if you're looking around for a representation of G, it's there as soon as you have a G-bundle.
On page 233 it defines a connection on a vector bundle.
On page 228 it defines a G-connection on a vector bundle that's a G-bundle.
On page 243 it defines the curvature of any connection on any vector bundle.
On page 252 it discusses the curvature of a connection on a vector bundle that's a G-bundle.
All these seem to be prerequisites for getting into your question, which additionally brings principal bundles into the mix, so let me just check that you know all this stuff.
John Baez said:
The book doesn't talk about principal bundles, so it sounds like you're trying to guess some stuff based on the book?
On page 214 it defines what it means for a vector bundle to be a G-bundle for a Lie group G. This means that the fiber V is a representation of G and some other stuff holds too.
I think this was confusing me. I saw "G-bundle" as "G-principal bundle" so I assumed that was what you were writing about. Maybe we here should call them "G-associated bundles" to avoid confusion in the future!
John Baez said:
On page 243 it defines the curvature of any connection on any vector bundle.
On page 252 it discusses the curvature of a connection on a vector bundle that's a G-bundle.
I looked through this again and here's what I could understand. On page 245 it says "By a result of the previous chapter, this means F(u,v) corresponds to a section of End(E)", which seems to explicitly answer my question- it's a section of End(E), whatever that is (where E is our vector bundle E -> M). The mysterious End(E) (defined on page 221 to be E x E*) is brought up again on page 252, where now it is somehow interrelated with g. On page 225 it defines an "End(E)-valued 1-form" as End(E) x T*M, so maybe page 252 is stating that Fuv is a section of End(E) x ext T*M? In which case, does End(E) have any relation to (g * M)? (g * M) is the vector bundle you get when you take the product of the underlying manifold of the Lie algebra g with M and get the projection g x M -> M, and then add in the vector space structure onto the bundle that makes each fiber of this projection into a vector space, namely the underlying vector space of the Lie algebra g. Does any of this make sense? I'm so confused and getting mixed up with all these kinds of bundles!
Also it's extremely confusing that on page 222 a "gauge transformation" is defined to be a section of End(E), which corresponds to an endomorphism of E- this directly contradicts the assertion made above that a gauge transformation is an automorphism, not a generic endomorphism, of E (an automorphism is a special kind of endomorphism, but not all endomorphisms are automorphisms!)
John Onstead said:
I think this was confusing me. I saw "G-bundle" as "G-principal bundle" so I assumed that was what you were writing about. Maybe we here should call them "G-associated bundles" to avoid confusion in the future!
People usually say G-bundle to mean any fiber bundle built from a principal G-bundle using the associated bundle trick, which depends on choosing a space called the 'fiber' F on which G acts on the left. Special cases:
1. If F = G and we let G act on itself on the left, our G-bundle is called a principal G-bundle.
2. If F = V is a vector space and G acts as linear transformations on V, our G-bundle will be a vector bundle.
Since my book is mainly about vector bundles, when I say G-bundle, I mean one of type 2.
Also it's extremely confusing that on page 222 a "gauge transformation" is defined to be a section of End(E), which corresponds to an endomorphism of E.
That's not my definition of "gauge transformation". That would indeed be wrong.
I'll tackle your longer question later - I've got a meeting now!
On page 225 it defines an "End(E)-valued 1-form" as End(E) x T*M
You meant to say it defines an End(E)-valued 1-form as a section of the vector bundle End(E) ⊗ T*M. Indeed that's what it is.
In which case, does End(E) have any relation to (g * M)?
Yes, but we have to straighten out some things first. Let's fix something here:
(g * M) is the vector bundle you get when you take the product of the underlying manifold of the Lie algebra g with M and get the projection g x M -> M,
I don't think we ever want to do that, except locally.
Here's something similar that we do want to do.
Say that we start with a principal G-bundle P → M. We can build a vector bundle over M from any representation ρ of the Lie group G on any vector space V, using the associated bundle trick. The resulting vector bundle is often called P ×_ρ V.
Now, G always has a god-given representation on its own Lie algebra g, called the adjoint representation, Ad.
So, from a principal G-bundle P we can always form the vector bundle

P ×_Ad g.

People call this vector bundle Ad(P) for short.
This is a bundle whose fibers are all isomorphic (as vector spaces and even Lie algebras!) to g.
It's very important because: as we'll see below, the curvature of a connection on P is an Ad(P)-valued 2-form.
But only when P is the trivial principal G-bundle

P = G × M

do we expect Ad(P) to be isomorphic to the trivial bundle of Lie algebras

g × M.

Of course, all bundles are trivializable locally, so for local computations we can assume without loss of generality that P is trivial. Then instead of Ad(P)-valued k-forms we can work with g-valued k-forms.
But for global issues (the opposite of local issues), we should work with Ad(P) when doing gauge theory, not g × M.
This is interesting, I hadn't seen this adjoint bundle before but it looks cool! Wikipedia was what first told me that a connection on a principal G-bundle is precisely a section of the (g * M) x T*M bundle. The details are found on this page. Maybe someone should at least add a note to the page that Ad(P) x T*M is a more accurate bundle to use instead! Edit: actually I just checked and there is a section on that page already for adjoint bundles, but it's at the bottom rather than given priority at the top, quite odd!
I'm also extremely confused about curvatures as sections because, after learning more about curvature, I no longer actually think it is a section of any bundle at all, but rather an operator on sections! I re-read your section on the exterior covariant derivative and it makes a little more sense to me now- given a connection on a vector bundle V (such as Ad(P)), it's an operator Sections(ext^k T*M x V) -> Sections(ext^(k+1) T*M x V) that takes in a V-valued k-form and outputs a V-valued (k+1)-form. When k=0, this gives you back the "normal" covariant derivative Sections(V) -> Sections(T*M x V). The curvature then is the failure of this exterior covariant derivative to abide by the usual property of an exterior derivative, that squaring it equals zero. If you expand this out you get the usual expression for curvature in terms of commutators, R(X,Y)s = DX DY s - DY DX s - D[X,Y] s, with s a section of V. But this isn't a section of anything- instead, given a fixed choice of X and Y, it's actually a map between sections of V: R(X,Y): Sections(V) -> Sections(V)! Thus, the curvature is not itself a section, but actually an operator that acts on sections! But then I remember your book (and you above) mentioned that curvature can be considered a section of a bundle, and I've helplessly fallen into confusion again.
You always sound so desperate. It's supposed to be fun. :upside_down:
If you read my book you'll see the curvature of a connection D on a vector bundle E is a bunch of operators on sections of this bundle

F(u,v)s = D_u D_v s - D_v D_u s - D_[u,v] s

(where u, v are vector fields and s is a section of E) with the nice property that all the information about it is contained in an End(E)-valued 2-form. The point is that F eats vector fields u and v and spits out an endomorphism of E in a bilinear antisymmetric manner; this endomorphism acts on sections of E via

(F(u,v)s)(p) = F(u,v)(p)(s(p)).
Start at the bottom of page 248.
In a similar way, the curvature of a connection on a principal G-bundle P gives an Ad(P)-valued 2-form.
I should emphasize that these stories aren't separate. Given any n-dimensional vector bundle E → M there is a principal GL(n)-bundle called its 'frame bundle'. Connections on this principal bundle correspond bijectively in a natural way to connections on the vector bundle E.
Using this trick, connections on vector bundles can be treated as a special case of connections on principal bundles!
John Baez said:
You always sound so desperate. It's supposed to be fun.
I understand, but whenever I don't understand something, I get frustrated and it's hard for me to write frustrated without it leaking through in some way!
John Baez said:
I should emphasize that these stories aren't separate. Given any n-dimensional vector bundle E→M there is a principal GL(n)-bundle called its 'frame bundle'. Connections on this principal bundle correspond bijectively in a natural way to connections on the vector bundle E.
That's interesting! This is probably because if P is the frame bundle of E, then Ad(P) ~ End(E). It almost makes it seem like the "frame bundle" construction is, in some sense, the "opposite" of the associated bundle construction.
John Baez said:
The point is that F(u,v) eats vector fields u and v and spits out an endomorphism of E in a bilinear antisymmetric manner
My apologies, but I'm still a little confused. F(u,v) is an operator F(u,v): Sections(E) -> Sections(E), thus it corresponds to an endomorphism of Sections(E) but not of E itself. An endomorphism of E acts on points and fibers, not on sections- I guess you can compose a section B -> E with an endomorphism E -> E but I don't think this is what you mean. If it were, that would imply that in some cases an endomorphism Sections(E) -> Sections(E) can "lift" to an endomorphism E -> E. But there's a huge problem with this- sections can overlap with one another! For instance, let's say that you have two sections of E, s1 and s2, and at a point p we have s1(p) = s2(p). Now, let's apply an operator O: Sections(E) -> Sections(E). You obviously can't guarantee that O(s1)(p) = O(s2)(p). For such an operator, you thus cannot define a corresponding endomorphism of E, since it would require pulling a single point s1(p) apart in two different directions. So I think there's probably a point that I'm missing!
It's true that not all operations that take sections to sections can be represented by an endomorphism of the bundle, but there's something special about the curvature operations: they are local, which means among other things that when the input sections overlap the output sections also overlap in the same way.
Like @James Deikun said.
An endomorphism of E acts on points and fibers, not on sections- I guess you can compose a section B -> E with an endomorphism E -> E but I don't think this is what you mean.
Yes, that's exactly what I mean. Endomorphisms of a vector bundle E act on sections in this way - and more interestingly, all sufficiently 'nice' maps from the set of sections to itself come from endomorphisms of E. So, while curvature starts out as a bunch of maps sending sections to sections, because it's 'nice' these come from endomorphisms of E.
Here's the general idea:
Given a real vector bundle E over a topological space B, its set of sections

Γ(E)

is a module of the ring C(B) of continuous real-valued functions on B:

(fs)(p) = f(p)s(p) for all f ∈ C(B), s ∈ Γ(E), p ∈ B.

Then we have this nice theorem: vector bundle endomorphisms

T: E → E

are in bijective correspondence to module endomorphisms

Γ(T): Γ(E) → Γ(E).

The correspondence goes like this:

(Γ(T)s)(p) = T(s(p)).
All this works for complex vector bundles as well as real ones, and for smooth vector bundles as well as topological ones, where we replace the ring of continuous functions by the ring of smooth functions C^∞(B).
The formula above defines Γ(T) if you have the bundle endomorphism T. But in dealing with curvature we need the reverse correspondence, where we have a module endomorphism and we want to get a vector bundle endomorphism of E from it!
The reverse correspondence can also be constructed in a very simple way, but I'll leave that as a little puzzle.
(Or you could read the section on curvature in my book. Everything I'm saying here is much more 'highbrow' than in my book. In my book I don't use general theorems to get F(u,v) as a bundle endomorphism, I just get the job done. In fact I may just charge ahead to the next step, which is to get the curvature as an End(E)-valued 2-form.)
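(If a finite-dimensional cartoon helps, here is a toy Python version of the forward correspondence - my own made-up example with a 3-point base space, so a 'section' is just a list of vectors. It checks that the pointwise operator (Γ(T)s)(p) = T_p s(p) is C(B)-linear:)

```python
import numpy as np

# The trivial rank-2 bundle over a 3-point base B. A bundle endomorphism T
# is a 2x2 matrix T_p at each point p; the corresponding module endomorphism
# acts on a section s pointwise.
B = [0, 1, 2]
T = [np.array([[1.0, 2.0], [0.0, 1.0]]),
     np.array([[0.0, 1.0], [1.0, 0.0]]),
     np.array([[2.0, 0.0], [0.0, 3.0]])]   # one matrix per point of B

s = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
f = [5.0, -1.0, 2.0]                       # a "function" on B

Ts = [T[p] @ s[p] for p in B]              # Gamma(T) applied to s
lhs = [T[p] @ (f[p] * s[p]) for p in B]    # Gamma(T)(f*s)
rhs = [f[p] * Ts[p] for p in B]            # f * Gamma(T)(s)
print(all(np.allclose(a, b) for a, b in zip(lhs, rhs)))  # True: C(B)-linear
```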
James Deikun said:
It's true that not all operations that take sections to sections can be represented by an endomorphism of the bundle, but there's something special about the curvature operations: they are local, which means among other things that when the input sections overlap the output sections also overlap in the same way.
Thanks, this at least helps me see where the problem is. I guess the sensible follow-up would be why the curvature operators are "local" in this way. In a sense, even the derivative operator Sections(R x R) -> Sections(R x R) isn't "local" in this way (even though derivatives are local operations). Let's say you have two sections R -> R x R: x^2 and 6x-5. These two agree at x=1, but applying the derivative to both causes them to diverge at x=1, since x^2 corresponds to (1,2) and 6x-5 corresponds to (1,6). After thinking it through I think it's because the derivative isn't fully local- it depends on infinitesimally nearby points to the one you are looking at, rather than just the point itself. That means the curvature must somehow only depend on the value at a point rather than even on infinitesimally close points to it! But again I'm not sure how since, intuitively, the curvature isn't a property at a single point either; it describes a "deformation" of a space over an area. In addition, the curvature is defined in terms of a differential operator (the covariant derivative), so how does the curvature have this "local" property when the operator used to define it does not? Kind of odd!
John Baez said:
vector bundle endomorphisms $T \colon E \to E$ are in bijective correspondence to module endomorphisms $\tilde{T} \colon \Gamma(E) \to \Gamma(E)$
But, I take it, module endomorphisms Sections(E) -> Sections(E) are not in bijective correspondence with general endomorphisms Sections(E) -> Sections(E)? In a sense, I proved above that there can't be a general bijection between endomorphisms on E and on Sections(E) when I showed that not all operators on Sections(E) can arise from endomorphisms on E. Though now I'm confused about what Sections(E) even is. I discussed this a while back in this thread, and the conclusion was that it's a vector space. I'm guessing viewing Sections(E) as a module over C(B) is different from this perspective, though it does make me wonder how to view Sections(E) in this way (i.e., in terms of dependent products and all that)
There are many ways to think about this, and you've got a nice intuitive way there. Here's another way. The operator
$$F(u,v)s = D_u D_v s - D_v D_u s - D_{[u,v]} s$$
seems to involve first and second derivatives of $s$. But if you calculate it (as I do in my book) you'll see that the value of $F(u,v)s$ at some point only depends on the value of $s$ at that point! All the derivatives of $s$ cancel out.
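Here is a concrete check of that, on the trivial line bundle over $\mathbb{R}^2$ with made-up connection coefficients (a sympy sketch; the sections and coefficients are chosen only for illustration):

```python
import sympy as sp

x, y = sp.symbols('x y')
# Made-up connection coefficients on the trivial line bundle over R^2:
A_x, A_y = y**2, sp.sin(x)

Dx = lambda s: sp.diff(s, x) + A_x * s   # covariant derivative along d/dx
Dy = lambda s: sp.diff(s, y) + A_y * s   # covariant derivative along d/dy

# Curvature on the coordinate fields ([d/dx, d/dy] = 0, so no bracket term):
F = lambda s: sp.simplify(Dx(Dy(s)) - Dy(Dx(s)))

# Two sections that agree at the origin but differ everywhere nearby:
s1, s2 = 1 + x + y, sp.Integer(1)
p = {x: 0, y: 0}

print(Dx(s1).subs(p), Dx(s2).subs(p))  # 1 and 0: the derivative sees nearby points
print(F(s1).subs(p), F(s2).subs(p))    # equal: the curvature only sees s at the point
```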
John Onstead said:
But, I take it, module endomorphisms Sections(E) -> Sections(E) are not in bijective correspondence with general endomorphisms Sections(E) -> Sections(E)?
What are 'general' endomorphisms? Endomorphisms of $\Gamma(E)$ as a set? As a vector space?
There are tons more endofunctions on $\Gamma(E)$ than vector space endomorphisms, of course.
And there are tons more vector space endomorphisms than endomorphisms of $\Gamma(E)$ as a module over the ring of smooth functions. An example is the covariant derivative, which has
$$D_v(fs) = (vf)s + f D_v s \ne f D_v s \text{ in general.}$$
See why?
This is why it's exciting that the curvature $F(u,v)$ is a module endomorphism. Compute
$$F(u,v)(fs)$$
using the definitions and show a lot of stuff cancels leaving
$$f\,F(u,v)s.$$
Though now I'm confused about what Sections(E) even is.
Like most things in math, it's lots of things! It's a set, and a topological space, and a vector space, and a module of the ring of smooth functions on the base. It's your job to say which you mean at any given time. This is why you shouldn't talk about 'general' endomorphisms unless the category is clear.
And you have to be willing to change categories at a moment's notice.
John Baez said:
This is why it's exciting that the curvature F(u,v) is a module endomorphism. Compute
F(u,v)(fs)
using the definitions and show a lot of stuff cancels leaving
fF(u,v)s
I hope I did this correctly. So if I'm given the covariant derivative DX (fs), this requires a product rule which is apparently (Xf)s + f DX s. Doing that over and over again, things get messy but some things cancel out:
- So DX DY (fs) - DY DX (fs) seems to be (X(Yf))s - (Y(Xf))s + f DX DY s - f DY DX s.
- D[X,Y] (fs) = ([X,Y]f)s + f D[X,Y]s.
- So the total expression becomes F(X,Y) (fs) = (X(Yf))s - (Y(Xf))s + f DX DY s - f DY DX s - ([X,Y]f)s - f D[X,Y]s.
- Combining terms we get (X(Yf) - Y(Xf) - [X,Y]f)s + f(DX DY s - DY DX s - D[X,Y]s). But that last term is just fF(X,Y)s so we get (X(Yf) - Y(Xf) - [X,Y]f)s + fF(X,Y)s.
- I think there's some rule [X,Y]f = X(Yf) - Y(Xf) so the first part becomes zero. So in the end you are left with F(X,Y)(fs) = fF(X,Y)s.
I really hope I did everything correctly, I might have made a few mistakes since there were so many terms and I might have gotten a few formulas wrong!
But in any case knowing F(X,Y)(fs) = fF(X,Y)s doesn't immediately help. How does this being true in any way imply that if s1(x) = s2(x) at a point x, then F(X,Y)(s1)(x) = F(X,Y)(s2)(x)? I will need to prove the latter if F(X,Y) truly does constitute an endomorphism.
Oh! Just noticed something interesting. If Sections(E) really does form a module over the ring of real valued functions on B, then indeed F(X,Y) does form a module endomorphism! The definition of a module homomorphism T requires that T(av) = a T(v) where a is a scalar and v is a vector. In the module Sections(E), this corresponds to requiring T(fs) = fT(s) where s is a section of E and f is a real valued function. We can see easily that F(X,Y) satisfies this by the proof above that F(X,Y)(fs) = fF(X,Y)(s). If I then take your word that there's a bijection between module endomorphisms on Sections(E) and endomorphisms on E, then the proof is complete. The only unfortunate thing is that I'm still finding it hard to take your word that this bijection exists!
Edit: Wait I think I understand a bit more. A real valued function on B assigns a real number to each point in B, and a section of E assigns a vector to each point in B. So as long as E is real valued, you can define a "multiplication" fs by doing scalar multiplication at each point between the real number defined there and the vector defined there. This gives the module structure. Still confused about the bijection but now I think I at least see how the module works!
John B. wrote:
This is why it's exciting that the curvature $F(u,v)$ is a module endomorphism. Compute $F(u,v)(fs)$ using the definitions and show a lot of stuff cancels leaving $f\,F(u,v)s$.
John O. wrote:
I hope I did this correctly.
Let me LaTeX your calculation so I can see and understand it better!
We're trying to show $F(X,Y)(fs) = f F(X,Y)s$, where by definition
$$F(X,Y)s = D_X D_Y s - D_Y D_X s - D_{[X,Y]} s.$$
Using the product rule
$$D_X(fs) = (Xf)s + f D_X s$$
twice we get
$$D_X D_Y (fs) = (X(Yf))s + (Yf) D_X s + (Xf) D_Y s + f D_X D_Y s$$
(and similarly with $X$ and $Y$ switched), and using it a third time we get
$$D_{[X,Y]}(fs) = ([X,Y]f)s + f D_{[X,Y]} s$$
so we have
$$F(X,Y)(fs) = (X(Yf))s - (Y(Xf))s - ([X,Y]f)s + f D_X D_Y s - f D_Y D_X s - f D_{[X,Y]} s.$$
Combining terms this equals
$$\bigl(X(Yf) - Y(Xf) - [X,Y]f\bigr)s + f\bigl(D_X D_Y s - D_Y D_X s - D_{[X,Y]} s\bigr)$$
but the second big expression here is just $f F(X,Y)s$, and the definition of the Lie bracket is
$$[X,Y]f = X(Yf) - Y(Xf)$$
so the first expression is zero!
Nice, you did it just right.
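As a cross-check, the same cancellation can be verified symbolically on a trivial line bundle over $\mathbb{R}^2$, using coordinate vector fields (so the bracket term vanishes) and arbitrary connection coefficients $a$, $b$; this is only a sketch of a special case, not the general proof:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)   # an arbitrary smooth function
s = sp.Function('s')(x, y)   # an arbitrary section of the trivial line bundle
a = sp.Function('a')(x, y)   # arbitrary connection coefficients:
b = sp.Function('b')(x, y)   # D_X = d/dx + a, D_Y = d/dy + b

Dx = lambda u: sp.diff(u, x) + a * u
Dy = lambda u: sp.diff(u, y) + b * u
F  = lambda u: Dx(Dy(u)) - Dy(Dx(u))   # [d/dx, d/dy] = 0, so no bracket term

# All the derivatives of f cancel:
print(sp.simplify(sp.expand(F(f * s) - f * F(s))))   # prints 0
```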
My apologies, I'm still inexperienced in LaTeX though I probably should get used to working with it at some point!
The only unfortunate thing is that I'm still finding it hard to take your word that this bijection exists!
Don't take my word for it!
I described the map from vector bundle endomorphisms of $E$ to module endomorphisms of $\Gamma(E)$. Given a vector bundle endomorphism
$$T \colon E \to E$$
we get a module endomorphism
$$\tilde{T} \colon \Gamma(E) \to \Gamma(E)$$
in this way:
$$(\tilde{T} s)(x) = T(s(x)).$$
It's quite easy to check that this direction works.
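For the record, the check is a one-liner, since $T$ acts linearly on each fiber (a sketch of what "this direction works" means here):
$$\tilde{T}(fs)(x) = T\bigl(f(x)\,s(x)\bigr) = f(x)\,T(s(x)) = \bigl(f\,\tilde{T}s\bigr)(x),$$
so $\tilde{T}$ is $C(B)$-linear.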
I left it as an exercise getting the inverse map from module endomorphisms of $\Gamma(E)$ to vector bundle endomorphisms of $E$. Admittedly this is the harder direction! But maybe doing a special case will help you get going.
Puzzle. Any ring $R$ is a left module over itself, by its usual multiplication. Show that for any ring $R$, every left $R$-module map
$$p \colon R \to R$$
is of the form
$$p(r) = rs$$
for some unique $s \in R$.
By the way, this exercise is one reason we like rings to have identity elements.
Using this, show that for any manifold $M$, all endomorphisms of the $C(M)$-module of sections of the trivial line bundle $M \times \mathbb{R}$ come from vector bundle endomorphisms!
You've hinted the identity element plays a role so I will make use of that! Let's say we have a module map p: R -> R; we will then consider p(idR), where idR is the multiplicative identity of the ring. Since p, as a homomorphism, must obey the condition p(ar) = a p(r), this means for any element r, p(r idR) = r p(idR). Since r idR = r, this means p(r) = r p(idR). So it seems that p(idR) plays the role of s in the above. I hope that makes sense!
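Here is that argument run in a deliberately noncommutative ring, R = 2x2 integer matrices, which also shows why it's right multiplication that appears (a toy check; all the matrix values are made up):

```python
import numpy as np

# Right multiplication p(r) = r @ s is a LEFT R-module map: p(a @ r) = a @ p(r).
s = np.array([[1, 2],
              [3, 4]])
p = lambda r: r @ s

a = np.array([[0, 1],
              [5, 7]])
r = np.array([[2, 0],
              [1, 1]])
assert np.array_equal(p(a @ r), a @ p(r))   # left R-linearity

# And s is recovered as the image of the identity element:
assert np.array_equal(p(np.eye(2, dtype=int)), s)
print("p is left R-linear and p(1) = s")
```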
Yes, that's right! Did you notice we've used this argument before in our conversations?
Not for rings, but....
Well this is a common pattern when working with groups and rings, you could be referring to anything from something we did when discussing torsors to the work we did a while ago to show that a map out of the trivial line bundle is a section.
Yes, I was referring to how all left G-set endomorphisms of G come from right multiplication by an element of G, which we ran into in our work on torsors.
I believe all these things should be spinoffs of the Yoneda Lemma but I'm too lazy to figure out how. The Yoneda Lemma makes crucial use of how categories have identity morphisms.
Let's see if I can finish this line of reasoning for the trivial line bundle. So the set of sections of this bundle is C(M), which is also the ring of real valued functions on M; it becomes a C(M)-module by multiplication onto itself. Just as we showed with a general ring, this implies that every endomorphism of C(M) as a C(M)-module corresponds to multiplication by an element of C(M) (and so multiplication by a section). Meanwhile, back at the trivial line bundle itself, an endomorphism corresponds, at every fiber (which is just R), to scaling the real number line by some real number amount (since a linear map R -> R is just multiplication by a fixed real number). The amount you scale by at a particular point is itself a real number, and so you can assign a "scale factor" to each fiber (representing the scaling you have to do at each fiber under the bundle endomorphism) via a section of the bundle itself. A section transformed by the endomorphism is then given by multiplying, at each point, by the scale factor assigned to that point. But this is just multiplication of the scale factor section onto the target section within C(M). Thus, for every endomorphism of the trivial line bundle, we have a scale factor section, and fixing this section for multiplication determines an endomorphism of C(M) given by the endomorphism that sends the multiplicative identity to that section. I think that shows what we want to show- that every endomorphism on the trivial line bundle gives rise to an endomorphism on C(M)!
Of course, still a lot of work to do to show this generalizes, and then to go the opposite way!
Great, yes that's how it works for a trivial line bundle!
Next you can try a finite direct sum of trivial line bundles, and use: every endomorphism of the left $R$-module $R^n$ is given by right multiplication by some $n \times n$ matrix with entries in $R$.
After that you can use the fact that every vector bundle is locally isomorphic to a finite direct sum of trivial line bundles.
then to go the opposite way!
Turning a vector bundle endomorphism into an endomorphism of its module of sections is much easier: I gave the formula for how it works, and you can easily check this gives a module endomorphism.
Checking that the two processes are inverses... this is a mopping-up operation. :upside_down:
John Baez said:
Next you can try a finite direct sum of trivial line bundles, and use: every endomorphism of the left R-module R^n is given by right multiplication by some n×n matrix with entries in R.
I'm confused by this a lot. In the case where we had a ring that was a module over itself, we could show that an element of the ring corresponded to an endomorphism on the ring. This was helpful in the trivial line bundle case since we could represent the "scale factor" as a section of the same bundle, and thus as an element of the resulting C(B) ring that was a module over itself. Thus, endomorphism on trivial line bundle <-> scale factor <-> element of C(B) <-> endomorphism on C(B) as module over itself. However, for a general vector bundle E, there's a break in that chain of logic since the scale factors at each point are, as you mention, n x n matrices, and not vectors of E. As such, the "scale factor" is not a section of E and thus not an element of Sections(E) (but rather is a section of End(E)!) Essentially, I'm asking for your help in repairing that chain above so it reads something like: endomorphism on bundle E <-> scale factor (n x n matrix field) <-> ? <-> endomorphism on Sections(E).
If the "scale factor" is not an element of Sections(E), then how are we supposed to believe it generates an endomorphism on Sections(E)? I'm guessing because we're not supposed to take Sections(E) as a module over itself, but then I'm confused by why you wanted me to do that exercise in the first place if it only works for the trivial line bundle. If Sections(E) is a module over C(B) as mentioned above, that still doesn't help me much since each element of C(B) is not a matrix but just a real valued function, and while multiplication by a scalar may give an endomorphism on the overall module, it certainly doesn't give every endomorphism!
John Baez said:
Turning a vector bundle endomorphism into an endomorphism of its module of sections is much easier: I gave the formula for how it works, and you can easily check this gives a module endomorphism.
I was trying to show how one could make an endomorphism on Sections(E) from one on the bundle. By the above chain of bijections I think I accidentally did it in reverse, but as you can tell from my remaining confusion above, that was purely accidental and only works for this very very special case of the trivial line bundle.
John Onstead said:
John Baez said:
Next you can try a finite direct sum of trivial line bundles, and use: every endomorphism of the left R-module R^n is given by right multiplication by some n×n matrix with entries in R.
I'm confused by this a lot. In the case where we had a ring that was a module over itself, we could show that an element of the ring corresponded to an endomorphism on the ring. This was helpful in the trivial line bundle case since we could represent the "scale factor" as a section of the same bundle, and thus as an element of the resulting C(B) ring that was a module over itself. Thus, endomorphism on trivial line bundle <-> scale factor <-> element of C(B) <-> endomorphism on C(B) as module over itself. However, for a general vector bundle E, there's a break in that chain of logic since the scale factors at each point are, as you mention, n x n matrices, and not vectors of E.
Indeed, things are different now!
Let me see where we stand. I'm trying to get you to show this:
Claim. For any ring $R$, every endomorphism of the left $R$-module $R^n$ is given by right multiplication by some $n \times n$ matrix with entries in $R$.
First of all, do you see why this is a good thing to show? Namely: do you see why, when $R = C(B)$ for a topological space $B$, this will imply that all endomorphisms of the module of sections of the trivial vector bundle $B \times \mathbb{R}^n$ over $B$ will come from sections of the bundle $\mathrm{End}(B \times \mathbb{R}^n)$?
Second, do you agree that you've already proved the claim when $n = 1$?
Third, I fully admit that the argument becomes a bit different when we try to handle $n > 1$. I'm not claiming that we can naively copy what we did for $n = 1$. $n \times n$ matrices are especially easy to study when $n = 1$ - and also especially misleading, since it's hard to tell the difference between an $n \times n$ matrix with entries in $R$ and an element of $R$ in this particular case!
So, when we get to $n > 1$ we need some new ideas. But I still think it was helpful to start with $n = 1$.
Yes, I am in agreement with all this. If we can show that every endomorphism of the C(B)-module Sections(E), for E a trivial (real) vector bundle of rank n, is given by right multiplication by some n x n matrix with entries in C(B), then indeed the rest follows. This is because such a matrix is a good way to represent a section of End(E), which itself would be an endomorphism of E. The "matrix field" M(x) representing the section of End(E) would act as the "scale factor" in higher dimensions, since it gives a matrix at every point of B that conveys the linear transformation of the endomorphism on that fiber.
I'm not sure if this helps, but we can draw an analogy with RVect. In RVect, the scalars are real numbers and the linear maps are represented as matrices whose entries are filled in with real numbers. If we move to a module over C(B), then the scalars are real valued functions on B. So we might then pretend, by analogy, that a linear map between modules over C(B) is represented by a matrix where each entry in the array is a real valued function on B. This gives us an array that looks like [F(x) G(x) ...], but we can also pretend we can move out the x and get [F, G, ...](x), which then gives us a matrix field over B. Since Sections(E) is a C(B)-module, endomorphisms of it are square matrices of real valued functions, each of which corresponds to a matrix field on B (a section of End(E)) and thus an endomorphism of E. This reasoning is convincing but I don't know if it's rigorous or not.
Okay, all your ideas are correct but maybe you don't know how to prove them in complete detail, so let's think about that a lot.
Let me work with a commutative ring $R$ to reduce worries, since then I can slack off about worrying about the difference between left and right $R$-modules, and this case covers the case $R = C(B)$. I feel guilty doing this, since noncommutative rings are great too, but I will suppress my feelings of guilt.
One learns in algebra class that every endomorphism of the $R$-module $R^n$ is described via matrix multiplication by an $n \times n$ matrix with entries in $R$.
Am I right that you want to figure out how to prove this? If so, how would you cook up a matrix given a module endomorphism $f \colon R^n \to R^n$?
(There's no need for $R$ to be a field here: we often use the assumption that $R$ is a field to show that every finitely generated $R$-module is isomorphic to some $R^n$, but we're assuming we've got $R^n$ right from the start!)
John Baez said:
Am I right that you want to figure out how to prove this? If so, how would you cook up a matrix given a module endomorphism f: R^n → R^n?
If we're working with RVect, I would say we can turn an endomorphism R^n -> R^n into a matrix by first selecting a basis of R^n (a set of basis vectors that "span" the space). Then, watch what the endomorphism does to those basis vectors. The matrix then is an array of numbers that tells you where the basis vectors "land" under the action of the endomorphism. Does this approach go beyond RVect? After all it assumes that both vectors and matrices are arrays of things valued in the ring/field that the vector space is defined over, but I'm not sure if this is true in general.
Yes, those are true by definition: for any ring $R$,
an element of $R^n$ is a list of elements $(r_1, \dots, r_n)$ with each $r_i \in R$,
an $n \times n$ matrix of elements of $R$ is an array $T_{ij}$ where $T_{ij} \in R$ for all $1 \le i, j \le n$.
Let's follow tradition and write $M(n,R)$ to mean the set of $n \times n$ matrices with elements in the ring $R$.
Can you prove that $R^n$ is a left module of $M(n,R)$ via the usual formula for matrices acting on vectors? If this seems too obvious to require proof, I'm okay with that: you just need to be sure that if you were held up at gunpoint one night and your assailant required you to write down a proof, you could quickly do it.
Can you also prove that $R^n$ is a left module of $R$ via the usual formula for scalars multiplying vectors?
Can you then prove the interesting thing, namely that all endomorphisms of $R^n$ as a left $R$-module come from left multiplication by some matrix $T \in M(n,R)$?
I'm not trying to make you write down long proofs here on Zulip, so if there's any particular steps in the arguments that seem iffy, we can focus on those.
John Baez said:
Yes, those are true by definition: for any ring R
Sure, but can you prove that all modules over R are isomorphic to some R^n? Also, the ring product R x R x R... = R^n isn't a vector space/module since it has a ring multiplication that allows you to "multiply vectors" together, so long as we are interpreting elements of R^n, given by lists of elements (r1, r2, ...) as column vectors [r1 r2 ...]. You would somehow have to restrict the ability to multiply to just scalars.
John Baez said:
Can you prove that R^n is a left module of M(n,R) via the usual formula for matrices acting on vectors?
So in this way, we are making a vector space where the "scalars" are the matrices themselves, and scalar multiplication is the usual matrix multiplication? That's certainly a twist if true! We want to know if maybe we could show that the set of all endomorphisms of R^n as an M(n,R)-module generated by scalar multiplication corresponds bijectively to all endomorphisms of R^n as an R-module.
John Baez said:
Can you then prove the interesting thing, namely that all endomorphisms of R^n as a left R-module come from left multiplication by some matrix T ∈ M(n,R)?
Here's how I would do it. If we select a basis in R^n given by {u1, u2, ...} then any vector in R^n is given by v = a1u1 + a2u2 + ... where a1, a2, ... are scalars. A linear transformation respects scalar products, so T(v) = T(a1u1 + a2u2 + ...) = a1T(u1) + a2T(u2) + ... by linearity. You can then decompose T(u1) into basis vectors as follows: T(u1) = b1u1 + b2u2 + ... and do this for every term in the original expansion. You then get T(v) = a1b1u1 + a1b2u2 + ... The resulting expansion is precisely the result you would expect of a matrix multiplication!
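That recipe can be run mechanically; here is a sympy sketch over R = Q[t], where the entries of the hidden matrix are made up for the demo:

```python
import sympy as sp

t = sp.symbols('t')
# A module endomorphism of R^2 for R = Q[t], given as a black box
# (secretly it is multiplication by the matrix M below):
M = sp.Matrix([[t, 1],
               [t**2 + 1, 2*t]])
endo = lambda v: M * v

# Cook up the matrix by feeding in the standard basis vectors:
e1, e2 = sp.Matrix([1, 0]), sp.Matrix([0, 1])
recovered = sp.Matrix.hstack(endo(e1), endo(e2))
assert recovered == M   # columns of the matrix = images of the basis vectors

# It then acts correctly on an arbitrary vector with polynomial entries:
v = sp.Matrix([t**3, 1 - t])
assert sp.expand(endo(v) - recovered * v) == sp.zeros(2, 1)
print("recovered the matrix from the endomorphism")
```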
John Onstead said:
John Baez said:
Yes, those are true by definition: for any ring R
Sure, but can you prove that all modules over R are isomorphic to some R^n?
No, because it's not true, even for finitely generated modules. I mentioned that this is what assuming R is a field buys you:
John Baez said:
[....] we often use the assumption that $R$ is a field to show that every finitely generated $R$-module is isomorphic to some $R^n$, but we're assuming we've got $R^n$ right from the start!)
When we learn algebra, first we work with fields, which buys us the simplification that all modules over a field are [[free|free modules]], so all morphisms between them are described by matrices. Then we grow up a bit and work with rings, or at least commutative rings, where things get a lot more interesting!
For example if $R = \mathbb{R}[x]$ is the ring of polynomials in $x$ with real coefficients, there's a module $\mathbb{R}$ where a polynomial $P$ acts on a real number $r$ as follows:
$$P \cdot r = P(0)\,r.$$
At right we're just multiplying two real numbers. Check the key module law:
$$(PQ) \cdot r = P \cdot (Q \cdot r).$$
This module is not isomorphic to any $R^n$ even though it's finitely generated.
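A quick sympy check of that module law, using the evaluation-at-zero action as written above (the specific polynomials are arbitrary):

```python
import sympy as sp

x, r = sp.symbols('x r')
act = lambda P, v: P.subs(x, 0) * v   # the action P . r = P(0) r

P = 3*x**2 + x + 5
Q = x - 2

# Key module law: (PQ) . r = P . (Q . r)
assert sp.simplify(act(P * Q, r) - act(P, act(Q, r))) == 0

# The polynomial x acts as zero, so this module is killed by x --
# something no nonzero free module allows:
print(act(x, r))   # 0
```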
John Baez said:
When we learn algebra, first we work with fields, which buys us the simplification that all modules over a field are [[free|free modules]], so all morphisms between them are described by matrices. Then we grow up a bit and work with rings, or at least commutative rings, where things get a lot more interesting!
Interesting, so you can express endomorphisms of a module over a ring as matrices exactly when it's a free module! That's an interesting fact! That means, as long as Sections(V) is a free module over C(B) for every vector bundle V, endomorphisms of Sections(V) as a C(B)-module are matrices of real valued functions. This then implies that curvature, which is an endomorphism of Sections(V) as a C(B)-module as we showed above, also then corresponds to a section of End(V), which is what we wanted to show. So as long as Sections(V) is a free C(B)-module, I think we're done here!
I think we've gone quite off track from our original discussion on gauge theory. I do wish to get back to that, but there's actually a few other things I wanted to clarify, such as how to approach some constructions we've mentioned (such as the associated bundle one) from a categorical perspective. After all, we are having this discussion on a CT server! I'll gather my thoughts and come back in an hour or two with our next digression so long as you're up for it! Also one last thing, thanks for your patience! For some reason this whole "curvature as a section of End(V)" took a while to "click" for me. Thanks for your help through it!
Okay, sure!
I can't resist saying this: since every vector bundle $E$ over a space $B$ is locally trivializable, each point has a neighborhood $U$ where the sections of the bundle pulled back to $U$ (we say 'restricted to' $U$) form a finitely generated free module of $C(U)$. Then we can use all the math we've been talking about, like treating endomorphisms of $E$ restricted to $U$ as matrix-valued functions.
But sections of $E$ over the whole space $B$ are not necessarily a free module.
We can deal with this by working locally and piecing together the result, or by using the [[Serre-Swan theorem]], which says that under various reasonable conditions the module of sections of a vector bundle will form the next best thing to a free module, which is called a [[projective module]].
We probably don't want to get into this here, but one way to define a projective module is that it's a summand of a free module. In other words, you can take the coproduct of your module with some other module, and get a free module. (This is Lemma 2.3 in the page [[projective module]].)
Thus, with some effort, a lot of tricks that work for free modules can be generalized to projective modules. So this is good.
Also, I think you've heard about 'projective resolutions'. So, there's a nice interaction between what we're doing now and our conversation about resolutions.
John Baez said:
But sections of E over the whole space B are not necessarily a free module.
You've certainly dampened my enthusiasm over my realization with this one! Presumably, even if E isn't globally trivial, you can still define a curvature R(X,Y) as a section of End(E), right? So even if the bundle isn't globally trivial and an endomorphism on Sections(E) isn't necessarily a matrix, then why does this correspondence still magically hold? If it has something to do with the behavior of projective modules I'm open to learning more about them!
This realization was foreshadowed yesterday when I said:
John Baez said:
Great, yes that's how it works for a trivial line bundle!
Next you can try a finite direct sum of trivial line bundles, and use: every endomorphism of the left $R$-module $R^n$ is given by right multiplication by some $n \times n$ matrix with entries in $R$.
After that you can use the fact that every vector bundle is locally isomorphic to a finite direct sum of trivial line bundles.
So I was suggesting a three-step process: first understand endomorphisms of trivial line bundles, then understand them for direct sums of trivial line bundles (aka trivial vector bundles), and then understand them for all vector bundles.
The first step was just pedagogical, as a warmup for the second. The second step was more substantial. The third step is an exercise in "proving things globally by working locally".
John Baez said:
The first step was just pedagogical, as a warmup for the second. The second step was more substantial. The third step is an exercise in "proving things globally by working locally".
I see, thanks for the clarification. This learning style was certainly interesting, but it's not how I usually approach things. Things make the most sense if I approach them from the most general angle first and then specialize from there, not the other way around. That is, it's easier for me mentally to break things down than to build things up! But encountering new perspectives is one of the reasons why I enjoy this server so much!
I do have a remaining question about this, which is why I never find this kind of analysis in the resources I read to learn more about curvature and bundles. These resources, your book included, go to great depths to prove to the readers and convince them that the curvature operator R(X,Y) acts tensorially- that is, satisfies R(X,Y)(fs) = f R(X,Y)(s). But then they stop just short of completing the proof when, instead of proving anything, they boldly assert that this tensorial property means the curvature is a section of End(E). Sure, a full blown digression into projective modules, the Serre-Swan theorem, and all that would be necessary to make this assertion rigorous might detract from what the authors are trying to write about. But it's still frustrating when I want to learn why something is the way it is, rather than merely that it is the way it is! I think authors should consider at least leaving a note that directs interested readers to further resources whenever they opt not to explain the why. Again, this is just a general frustration I wanted to convey, not in any way a criticism of your book in particular- in fact, I'm still enjoying going through it! I'm currently on the Yang-Mills equation section, and it's very interesting so far!
Things make the most sense if I approach them from the most general angle first and then specialize from there, not the other way around.
Well then we should be studying $\infty$-categories before doing anything else! :laughing:
Tackling a 1-dimensional trivial line bundle before doing an arbitrary trivial vector bundle was just to get your brain to transition to thinking about modules and their endomorphisms. But doing a trivial vector bundle before doing an arbitrary vector bundle is pretty much essential, since remember, the definition of vector bundle involves the phrase locally trivial! So we need to reduce most questions about vector bundles to the case of trivial bundles.
I do have a remaining question about this, which is why I never find this kind of analysis in the resources I read to learn more about curvature and bundles. These resources, your book included, go to great depths to prove to the readers and convince them that the curvature operator R(X,Y) acts tensorially- that is, satisfies R(X,Y)(fs) = f R(X,Y)(s). But then they stop just short of completing the proof when, instead of proving anything, they boldly assert that this tensorial property means the curvature is a section of End(E).
It's actually like a bunch of other facts physicists use, like this: a function $f$ is zero if $\int f g = 0$ for all functions $g$. It's "obvious", so they never bother to prove it, but it actually takes a little work to prove - and in that case, unlike the case we're doing, it takes some extra hypotheses to make it actually true.
Proving that fact requires a 'boring' digression into measure theory, just as now we're engaged in a 'boring' digression into the theory of vector bundles and their endomorphisms. It's not really boring - but if someone is trying to understand the Standard Model and come up with their own new theory of particle physics, they may be willing to take some facts for granted and move on.
John Baez said:
Well then we should be studying ∞-categories before doing anything else!
In an interesting twist, I actually learned about enriched categories while I was learning about ordinary categories!
John Baez said:
Proving that fact requires a 'boring' digression into measure theory, just as now we're engaged in a 'boring' digression into the theory of vector bundles and their endomorphisms. It's not really boring - but if someone is trying to understand the Standard Model and come up with their own new theory of particle physics, they may be willing to take some facts for granted and move on.
This makes sense, I'll certainly keep this in mind when engaging a resource to learn some new concept in math or physics!
I'll take a break for a bit to go over my notes, I might come back a bit later once I figure out where to go next!
As I mentioned before, before we get back on track to gauge theory, I'd like to clarify something I forgot to ask much earlier. We are often told that vector bundles over the point are just vector spaces. But on further thought I've found this inherently wrong. First, we recall that a vector bundle over X is a vector space object V in Top/X whose underlying bundle is locally trivializable. So a vector bundle over the point * would be such a vector space object in Top/*. But the problem is Top/* ~ Top, so it's really just a vector space object in Top. And this isn't a vector space unless you are focusing on the discrete topological spaces (in which case it is a vector space, since that part of Top is basically just Set). Instead, it's a topological vector space! So is it wrong to say that vector bundles over the point are vector spaces, and should we say that vector bundles over points are topological vector spaces instead?
You can say it, but when people say "vector bundle" we mean "finite-dimensional real or complex vector bundle" unless we state otherwise, and we use the equivalences
$$\mathsf{FinVect}_{\mathbb{R}} \simeq \mathsf{TopFinVect}_{\mathbb{R}}, \qquad \mathsf{FinVect}_{\mathbb{C}} \simeq \mathsf{TopFinVect}_{\mathbb{C}}$$
to regard our finite-dimensional real or complex vector spaces as finite-dimensional topological real or complex vector spaces. Since they're equivalences, this is no big deal.
Puzzle. Show that for a finite-dimensional real or complex vector space $V$ there's a unique topology such that the vector space operations (say addition $+ \colon V \times V \to V$ and scalar multiplication $\cdot \colon k \times V \to V$, where $k = \mathbb{R}$ or $\mathbb{C}$) are continuous. Show that choosing this topology gives functors
$$\mathsf{FinVect}_k \to \mathsf{TopFinVect}_k.$$
Show that these functors are equivalences, with the forgetful functors
$$\mathsf{TopFinVect}_k \to \mathsf{FinVect}_k$$
providing the equivalences going the other way.
For this reason nobody ever talks about a "topological vector space" if they're dealing with finite-dimensional vector spaces: every finite-dimensional vector space is automatically a topological vector space in a unique way.
Now, when we consider infinite-dimensional vector spaces and vector bundles, the situation changes completely. There are many different ways to make an infinite-dimensional vector space into a topological vector space! So then there is a lot to talk about. But it's a big subject and we should only get into it if we are tired of talking about other things.
I'm sorry, my statements above are false unless everywhere I said "topology" we replace that with "Hausdorff topology".
That is: people typically require a "topological vector space" to be a vector space with the structure of a Hausdorff space such that the vector space operations are continuous. With this extra condition of Hausdorffness, we get an equivalence between finite-dimensional real (or complex) vector spaces and finite-dimensional real (or complex) topological vector spaces.
Here's why we need this extra condition. If we don't, we can give $V$ the indiscrete topology and it becomes a topological vector space in a second way! (Every map into an indiscrete space is continuous, so the vector space operations are automatically continuous.)
So:
Revised Puzzle. Show that for a finite-dimensional real or complex vector space $V$ there's a unique Hausdorff topology such that the vector space operations (say addition $+ \colon V \times V \to V$ and scalar multiplication $\cdot \colon k \times V \to V$, where $k = \mathbb{R}$ or $\mathbb{C}$) are continuous. Show that choosing this topology gives a functor
$$\mathsf{FinVect}_k \to \mathsf{TopFinVect}_k$$
where $\mathsf{TopFinVect}_k$ is the category of finite-dimensional vector spaces over $k$ with a Hausdorff topology such that the vector space operations are continuous, and continuous linear maps between these.
Show that this functor is an equivalence, with the forgetful functor
$$\mathsf{TopFinVect}_k \to \mathsf{FinVect}_k$$
providing the equivalences going the other way.
I believe the only hard part about this puzzle is the uniqueness claim; all the category-theoretic stuff should be an easy spinoff of that.
Well, a topological vector space can arise from a vector space by equipping it with a norm. This norm induces a metric, which in turn induces a Hausdorff topology. So you have to show two things: that any two norms on a finite dimensional vector space give the same topology (and that linear maps between finite dimensional vector spaces preserve this topology), and that the only Hausdorff vector space topology on a finite dimensional vector space is the one given by a norm. Neither of these is trivial for someone inexperienced in analysis and topology like me- remember, I've never taken these classes in college! However, there are some good proofs I found online. My favorite one is this paper, which explains both of these to a sufficient degree. I'd probably have to understand more topology to fully understand what is going on, but I can loosely follow what this proof is doing.
With all that out of the way, let's consider the category theory part. TopFinVect -> FinVect is the obvious forgetful functor that forgets the topology. From the first proof, that any two norms on a finite dimensional vector space give the same topology, we know it has an adjoint free functor which freely equips an object of FinVect with the topology given by a norm. The second proof allows us to strengthen this free/forgetful adjunction into an equivalence by showing that this free structure is in fact the unique "extra structure" you can equip an object of FinVect with. This means that starting either in TopFinVect or FinVect, you can add and remove the topological structure, or remove and add it, and end up (isomorphic to) where you started, since everything is unique (up to isomorphism) in both directions.
John Onstead said:
Well, a topological vector space can arise from a vector space by equipping it with a norm. This norm induces a metric, which in turn induces a Hausdorff topology. So you have to show two things: that any two norms on a finite dimensional vector space give the same topology (and that linear maps between finite dimensional vector spaces preserve this topology), and that the only Hausdorff vector space topology on a finite dimensional vector space is the one given by a norm. Neither of these is trivial for someone inexperienced in analysis and topology like me- remember, I've never taken these classes in college! However, there are some good proofs I found online. My favorite one is this paper, which explains both of these to a sufficient degree.
That's nice. As I was reading this, I was thinking "bringing a norm into it seems distracting: we should be able to prove directly that any Hausdorff topological vector space structure on a finite-dimensional real or complex vector space is homeomorphic to the usual one". And that's actually what this lecture does: it proves the result without mentioning a norm. And it calls it the Tychonoff Theorem (Theorem 3.14). I never knew it had a name.
The proof is not trivial, even for someone like me who regularly teaches real analysis.
I'm more used to proving that any two norms on a finite-dimensional real or complex vector space induce the same topology, since I used to have to prove this almost every year while teaching the third quarter of the real analysis qualifier course.
It suffices to show that if your two norms on $\mathbb{R}^n$ are $\|\cdot\|$ and $\|\cdot\|'$, the ratio of norms
$$\|v\|' / \|v\|, \qquad v \ne 0,$$
is bounded, since then for some $C > 0$ we have
$$\|v\|' \le C \|v\|$$
and by the same argument
$$\|v\| \le C' \|v\|'$$
so inside any open neighborhood of the origin in one topology you can fit an open neighborhood from the other topology, so (by translation-invariance) the two topologies agree.
So, the fun part is getting the bound on
$$\|v\|' / \|v\|.$$
I leave this to the interested reader.
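For what it's worth, the boundedness is easy to see experimentally; here is a quick numerical sample for two standard norms on $\mathbb{R}^3$ (my choice of norms, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
norm1 = lambda v: np.abs(v).sum()   # the 1-norm
norm2 = lambda v: np.abs(v).max()   # the sup-norm

# Sample the ratio ||v||_1 / ||v||_sup over random directions in R^3:
vs = rng.standard_normal((10000, 3))
ratios = np.array([norm1(v) / norm2(v) for v in vs])
print(ratios.min(), ratios.max())   # stays within [1, 3], as the theory predicts
```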
This is very interesting! I hope to get into learning analysis at some point, but I've heard it's rather difficult. In fact, it's usually cited as the hardest math class at many different universities!
That said I should probably mention why I bring up vector bundles over a point: it's because I'm curious about the properties of the pullback. Given objects A and B and a morphism f: A -> B, if the category you are in has pullbacks, then there will be a "base change" functor C/B -> C/A derived from pulling back along f. It has two adjoints: the dependent coproduct (the left adjoint, given by the usual composition functor C/A -> C/B) and the dependent product (the right adjoint). The important thing is that given the terminal object *, the base change along the unique morphism A -> * gives a functor C/* -> C/A that sends an object X in C/* (so basically an object X in C) to the projection morphism X x A -> A in C/A.
Now, here's what I want to know. Let's say that S is a structure preserved by pullback, meaning if you equip a morphism with this structure, pulling back gives you another morphism that can also be equipped with some structure S determined from the pullback. An example is vector bundle structure- given a vector bundle, the pullback bundle is also a vector bundle, so we will say that pullbacks preserve vector bundle structure. Now, let S(A) be the category of S structure internal to C/A. My hypothesis is that if S is preserved under pullback, then the base change C/B -> C/A "lifts" to a base change S(B) -> S(A). Furthermore, if we have a terminal object, the product functor C/* -> C/A "lifts" to a functor S(C) -> S(A). An example of the latter case would be vector bundles where S = RVect. I'm essentially proposing there exists a functor from real topological vector spaces (the objects of RVect(Top); this is why I wanted to make sure that vector bundles over the point were topological vector spaces in my previous question) to real vector bundles over A via RVect(*) ~ RVect(Top) -> RVect(A). In particular, the topological field R forms a topological vector space over itself; this object in RVect(Top) will be sent by this base change functor precisely to the trivial line bundle R x A -> A over A in RVect(A), already equipped with all its vector space structure. So my question is: does all of what I said here hold true? If so, it implies a very general result: given any structure S preserved by pullback, you can take a "structured product" of S (an object of C with extra structure) with A (an object of C without extra structure) to get an object in S(A) whose underlying morphism is of the form S x A -> A.
John Onstead said:
This is very interesting! I hope to get into learning analysis at some point, but I've heard it's rather difficult. In fact, it's usually cited as the hardest math class at many different universities!
It's a matter of personal temperament. As an undergrad, since I did physics and a lot of analysis was developed for the purposes of physics, it all seemed very natural and fun. That's why I did my Ph.D with the analyst Irving Segal on trying to make quantum field theory mathematically rigorous. Back then algebra seemed the hardest - because I relied heavily on visualizing things, and I could neither visualize things like Sylow's theorem, nor see what use they were for physics. I got tenure for my work on nonlinear PDE, and only later did I get interested in category theory and other forms of algebra.
John Onstead said:
Now, here's what I want to know. Let's say that S is a structure preserved by pullback, meaning if you equip a morphism with this structure, pulling back gives you another morphism that can also be equipped with some structure S determined from the pullback. An example is vector bundle structure - given a vector bundle, the pullback bundle is also a vector bundle, so we will say that pullbacks preserve vector bundle structure. Now, let S(A) be the category of S structure internal to C/A. My hypothesis is that if S is preserved under pullback, then the base change C/B -> C/A "lifts" to a base change S(B) -> S(A).
I'm not sure what "S is preserved by pullback" means except perhaps something like "the base change C/B -> C/A lifts to a base change S(B) -> S(A)."
I suspect that someone who loves fibrations might do better at this question than me - that is, turning the question into some juicy theorem.
John Baez said:
I suspect that someone who loves fibrations might do better at this question than me - that is, turning the question into some juicy theorem.
I have an idea for how something like this might be done. The codomain fibration Arr(C) -> C is a fibration from the arrow category to C, given that C has pullbacks. If you "reverse engineer" this fibration into a pseudofunctor via the Grothendieck construction, you get the pseudofunctor C^op -> Cat that sends A in C to C/A and the morphisms in C^op to the pullback base change functors. The problem then becomes what happens when I have some structure S I want to put on the objects of the slice categories and thus the arrow category. For instance, S could be a Lawvere theory, in which case I want to investigate the category of models of this Lawvere theory in Arr(C). This gives a category Mod(Arr(C)) with a forgetful functor Mod(Arr(C)) -> Arr(C). I think the proof would then be to somehow determine all cases where the composition of this forgetful functor with the codomain fibration Arr(C) -> C is also a fibration (or, at the very least, that there exists a relevant fibration Mod(Arr(C)) -> C at all). This would give a fibration Mod(Arr(C)) -> C that can be "reverse engineered" to give a pseudofunctor C^op -> Cat; we would then have to show this indeed sends every object A to Mod(C/A), and sends morphisms to the "structured" base changes I'm thinking of.
I found a few more resources. At the nlab they actually use the case of vector bundles as the first example of a fibration! They also have an article for what it means for a property to be stable under pullback. In this case, it's clear vector bundle structure is stable under pullback, and any structure stable under pullback will give rise to a fibration since the lifting is possible.
Maybe we should put this issue aside for now, but I have one last thought. A finite product theory in C/A corresponds to "picking out" some products and diagrams in C/A, but that in turn corresponds to picking out pullbacks in C itself. If pullbacks are always "stable" under other pullbacks, then theoretically any finite product theory should give a fibration. I have no idea if I'm even making sense right now but hopefully you get what I mean!
John Onstead said:
John Baez said:
Yes, those are true by definition: for any ring R
Sure, but can you prove that all modules over R are isomorphic to some R^n? Also, the ring product R x R x R... = R^n isn't a vector space/module since it has a ring multiplication that allows you to "multiply vectors" together...
Since John didn't pick you up on this, I wanted to say that this is the first time I've seen someone claim that something failed to satisfy a definition because it has too much structure ;) R^n is definitely an R-module, but you've picked up on the fact it's actually an R-algebra
@Morgan Rogers (he/him) - thanks for noticing that!
I believe when @John Onstead sees $R$ he thinks "ring", so that $R^n$ must mean the product of $n$ copies of this ring in the category of rings - so it can't be an $R$-module. I'm always struggling to get him to accept that mathematicians don't stay so rigidly fixed in one category. I see $R^n$ and think it's a ring, it's a left $R$-module, it's a right $R$-module, it's an $R$-bimodule, it's an abelian group, it's a set, and if $R$ is a commutative ring it's also an $R$-algebra, etc. - and it's my job to figure out which interpretation is correct for the statement being uttered. When mathematicians write $R^n$ they realize that a bunch of forgetful functors going between these categories preserve products, so the ambiguity is benign.
John Baez said:
I believe when @John Onstead sees R he thinks "ring", so that R^n must mean the product of n copies of this ring in the category of rings - so it can't be an R-module. I'm always struggling to get him to accept that mathematicians don't stay so rigidly fixed in one category
When doing a subsequent exercise to that one I learned what R^n actually meant. It's not the nth product of R as a ring, it's the nth product of R as a module over itself!
I considered that clear by the time you said
"Sure, but can you prove that all modules over R are isomorphic to some R^n?"
Since you're talking about isomorphism of $R$-modules here, I assumed you meant using the product in the category of $R$-modules.
The next sentence is thus warning me against something I'd never have dreamt of:
Also, the ring product R x R x R... = R^n isn't a vector space/module since it has a ring multiplication that allows you to "multiply vectors" together...
In any event, most mathematicians use $R^n$ ambiguously in the manner I described, to mean the $n$-fold product of the object we call $R$, which is an object in a whole bunch of different categories that are defined by different Lawvere theories - rings, left $R$-modules, right $R$-modules, $R$-bimodules, abelian groups, and sets. And the ambiguity is harmless since all these categories are related by right adjoints, which preserve products, and send $R$ in one of these categories to $R$ in another.
Actually there is something I'm still confused about here. On the wikipedia page for free modules, they state that R^n isn't the nth product of R with itself, but rather the nth "direct sum" of R with itself. I guess that's fine if the direct sum is the categorical product within R-mod, but I just wanted to clarify!
A category of $R$-modules always has finite biproducts, so finite products are 'the same' as finite coproducts - you can use the same object for both! Ordinary mortals call biproducts of $R$-modules direct sums.
E.g. the 'direct sum' of finitely many vector spaces is both their product and coproduct.
(Having spent 20 years learning these conventions I seem to have forgotten how confusing they can be at first. It's even worse among physicists, who use phrases like 'direct sum', 'direct product', 'tensor product' and 'Kronecker product' in a quite undisciplined way.)
We can circle back to the chain of thought about theories and Grothendieck constructions above eventually; I'll probably open up a separate thread for it. For the time being here, I want to ease back into gauge theory. My question is: given E is an associated vector bundle for a principal G-bundle P, does this automatically mean P is (isomorphic to) its frame bundle? My instinct is that it doesn't make sense for it not to be true, otherwise it wouldn't always be possible to "transfer" concepts such as connection and curvature from a principal bundle to an associated bundle. If all associated vector bundles of P have P as their frame bundle, then it means it's always possible to take a connection or curvature section of P (given by a section of Ad(P) x T* M or Ad(P) x ext^2 T* M respectively) and "transfer" it to a connection or curvature section of E (given by a section of End(E) x T* M or End(E) x ext^2 T* M respectively) via the frame bundle isomorphism End(E) ~ Ad(P). But on further thought, wouldn't that imply that multiple non-isomorphic vector bundles have isomorphic End(E)? (if they share the same P as a frame bundle, they share the same Ad(P), and thus End(E)) That doesn't seem to make much sense either. So which is the correct option?
John Onstead said:
My question is: given E is an associated vector bundle for a principal G-bundle P, does this automatically mean P is (isomorphic to) its frame bundle? My instinct is that it doesn't make sense for it not to be true [....] But on further thought, wouldn't that imply that multiple non-isomorphic vector bundles have isomorphic End(E)? (if they share the same P as a frame bundle, they share the same Ad(P), and thus End(E)) That doesn't seem to make much sense either. So which is the correct option?
It can't be true - since the frame bundle of a vector bundle is always a principal $GL(n)$-bundle, and you started with a principal $G$-bundle where typically $G \ne GL(n)$. The groups aren't even the same, so the frame bundle can't be our original principal bundle!
Having a few concrete examples under your belt will let you instantly disprove lots of conjectures about how gauge theory works.
Consider the strong force, where we start with $G = SU(3)$ and describe gluons using the adjoint representation, say $\rho$, of this group on (the complexification of) its Lie algebra, which as a vector space is isomorphic to $\mathbb{C}^8$. (We thus say there are 8 kinds of gluon.) Starting from any principal $G$-bundle $P$ we get an associated bundle
$$E = P \times_\rho \mathbb{C}^8.$$
This is an 8-dimensional complex vector bundle, so taking its frame bundle we get a principal $GL(8,\mathbb{C})$-bundle.
Our group has changed from $SU(3)$ to $GL(8,\mathbb{C})$. The relation between these groups is that our representation is a Lie group homomorphism
$$\rho \colon SU(3) \to GL(8,\mathbb{C}).$$
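The "8" here is the dimension of $\mathfrak{su}(3)$, the traceless anti-Hermitian 3x3 matrices: 18 real parameters, cut to 9 by anti-Hermiticity and to 8 by tracelessness. A brute-force sanity check of that count, with a basis I cooked up by hand (nothing standard about it):

```python
import numpy as np

# Build a real basis of su(3): traceless anti-Hermitian 3x3 complex matrices.
basis = []
for i in range(3):
    for j in range(i + 1, 3):
        E = np.zeros((3, 3), dtype=complex)
        E[i, j] = 1
        basis.append(E - E.T)          # real antisymmetric part
        basis.append(1j * (E + E.T))   # imaginary symmetric part
for k in range(2):                     # diagonal traceless directions
    D = np.zeros((3, 3), dtype=complex)
    D[k, k], D[k + 1, k + 1] = 1j, -1j
    basis.append(D)

mat = np.array([b.flatten() for b in basis])
rank = np.linalg.matrix_rank(np.concatenate([mat.real, mat.imag], axis=1))
print(len(basis), rank)   # 8 8: eight linearly independent directions
```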
Going back to what led you into this question: the representation $\rho$ of $G$ provides all the necessary 'glue' to transfer concepts like connection, curvature etc. from the principal $G$-bundle $P$ to its various associated bundles. For these purposes it's not even helpful to bring in the frame bundle of an associated bundle!
The frame bundle is mainly interesting if someone hands you nothing but a vector bundle and forces you to provide them with a principal bundle that it's associated to.
For example, say someone hands you nothing but a manifold: you can form its tangent bundle, and then the frame bundle of that.
John Baez said:
Going back to what led you into this question: the representation of G provides all the necessary 'glue' to transfer concepts like connection, curvature etc. from the principal G-bundle P to its various associated bundles. For these purposes it's not even helpful to bring in the frame bundle of an associated bundle!
But then my question becomes, what is the explicit formula that tells you how to start with a section of Ad(P) x T* M and get to the corresponding End(E) x T* M for a vector bundle E associated to P?
I'll give this a try but I don't know if it's right. Above you mention that there always exists a Lie group homomorphism between the Lie group of P and the Lie group of the principal bundle you get when you first make an associated vector bundle, and then take its frame bundle (which we will call P'). This should extend to a bundle homomorphism P -> P' in all cases. Since the associated bundle construction is functorial, it means taking the adjoint bundle is also functorial; we can apply this to get a morphism Ad(P) -> Ad(P'). The tensor product is also functorial, so this in turn extends to a morphism Ad(P) x T* M -> Ad(P') x T* M. The connection on P is a section X -> Ad(P) x T* M; we can then take the composition X -> Ad(P) x T* M -> Ad(P') x T* M to get a morphism X -> Ad(P') x T* M. Now, we know Ad(P') ~ End(E) by the frame bundle isomorphism, so this gives us a morphism X -> End(E) x T* M, which seems to be what we want. Is what I did here anywhere close to correct?
John Onstead said:
John Baez said:
Going back to what led you into this question: the representation of G provides all the necessary 'glue' to transfer concepts like connection, curvature etc. from the principal G-bundle P to its various associated bundles. For these purposes it's not even helpful to bring in the frame bundle of an associated bundle!
But then my question becomes, what is the explicit formula that tells you how to start with a section of Ad(P) x T* M and get to the corresponding End(E) x T* M for a vector bundle E associated to P?
I'll give this a try but I don't know if it's right.
Great problem!
Above you mention that there always exists a Lie group homomorphism between the Lie group of P and the Lie group of the principal bundle you get when you first make an associated vector bundle, and then take its frame bundle (which we will call P').
Usually nobody talks about this second frame bundle, so I'm curious to see what you do with it.
This should extend to a bundle homomorphism P -> P' in all cases. Since the associated bundle construction is functorial,
People usually consider maps between principal $G$-bundles, but here you have a map from a principal $G$-bundle to a principal $G'$-bundle 'riding' the group homomorphism $\rho \colon G \to G'$. So I agree with you about functoriality - but at some point let's carefully consider which category this functor comes out of.
More later.
Okay, I'm back.
what is the explicit formula that tells you how to start with a section of Ad(P) x T* M and get to the corresponding End(E) x T* M for a vector bundle E associated to P?
First, just a reminder: a connection is not a section of $\mathrm{Ad}(P) \otimes T^*M$. A difference of connections is such a section. Luckily, any trivialized $G$-bundle $P \to M$ comes with a god-given connection, the trivial connection $A_0$. So for any other connection $A$ on this bundle, we can describe the difference $A - A_0$ as a section of $\mathrm{Ad}(P) \otimes T^*M$.
An analogous story applies to connections on vector bundles and sections of $\mathrm{End}(E) \otimes T^*M$: these describe differences of connections. This is explained in my book.
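The check that a difference of connections is tensorial is the same one-line cancellation as for the curvature (a sketch, writing $D$ and $D'$ for the two connections):
$$(D_v - D'_v)(fs) = (vf)s + f D_v s - (vf)s - f D'_v s = f\,(D_v - D'_v)s,$$
so $D - D'$ is $C^\infty(M)$-linear in $s$, hence a section of $\mathrm{End}(E) \otimes T^*M$.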
That said, it's still interesting to see how a section of $\mathrm{Ad}(P) \otimes T^*M$ and a homomorphism $\rho \colon G \to GL(V)$ determine a section of $\mathrm{End}(E) \otimes T^*M$, where $E$ is the bundle associated to $P$ via the representation $\rho$ of $G$ on some vector space $V$.
Above you mention that there always exists a Lie group homomorphism between the Lie group of P and the Lie group of the principal bundle you get when you first make an associated vector bundle, and then take its frame bundle (which we will call P'). This should extend to a bundle homomorphism P -> P' in all cases. Since the associated bundle construction is functorial, it means taking the adjoint bundle is also functorial; we can apply this to get a morphism Ad(P) -> Ad(P'). The tensor product is also functorial, so this in turn extends to a morphism Ad(P) x T* M -> Ad(P') x T* M. The connection on P is a section X -> Ad(P) x T* M; we can then take the composition X -> Ad(P) x T* M -> Ad(P') x T* M to get a morphism X -> Ad(P') x T* M. Now, we know Ad(P') ~ End(E) by the frame bundle isomorphism, so this gives us a morphism X -> End(E) x T* M, which seems to be what we want. Is what I did here anywhere close to correct?
It seems on the right track, though superficially completely unlike how I would do it.
I can't resist a technical nitpick:
Writing a section of $\mathrm{Ad}(P) \otimes T^*M$ as X -> Ad(P) x T* M seems peculiar since the section is actually a kind of map
$$X \colon M \to \mathrm{Ad}(P) \otimes T^*M.$$
Maybe that's what you meant?
You can probably fix this with ease.
Here's how I would do things. I've got a section of $\mathrm{Ad}(P) \otimes T^*M$ and I want a section of $\mathrm{End}(E) \otimes T^*M$. This makes me want a map from $\mathrm{Ad}(P)$ to $\mathrm{End}(E)$.
$\mathrm{Ad}(P)$ is the vector bundle associated to $P$ via the representation $\mathrm{Ad} \colon G \to GL(\mathrm{Lie}(G))$, where $\mathrm{Lie}(G)$ is the Lie algebra of $G$.
$E$ is the vector bundle associated to $P$ via some representation $\rho \colon G \to GL(V)$.
$\mathrm{End}(E)$ is another vector bundle, and it must be associated to $P$ via some other representation of $G$. Indeed it is: there's a representation $\tilde{\rho} \colon G \to GL(\mathrm{End}(V))$ that does the job. (Since $G$ acts on $V$ it acts on anything built from $V$, like $\mathrm{End}(V)$.)
So: I want a map from the vector bundle associated to P via the representation
$$\mathrm{Ad} \colon G \to GL(\mathrm{Lie}(G))$$
to the vector bundle associated to P via the representation
$$\tilde{\rho} \colon G \to GL(\mathrm{End}(V))$$
So I'd think about the general question: if I have two vector bundles associated to the same principal G-bundle, coming from two different representations of G, when can I get a map between these vector bundles?
John Baez said:
So I'd think about the general question: if I have two vector bundles associated to the same principal G-bundle, coming from two different representations of G, when can I get a map between these vector bundles?
I think I saw in your book that there exists a homomorphism between associated bundles when there is a "G-equivariant" map between the fibers F. Now, the associated bundle functor (-) x_G (-): Prin_G x Space_G -> VectBund becomes a functor P x_G (-): Space_G -> VectBund when you fix a principal G-bundle. The category Space_G has G-equivariant maps between spaces. So if there's a G-equivariant map, the functor will send it to a bundle homomorphism between the resulting associated vector bundles.
So you'd need to find a G-equivariant map between Lie(G) and V. I'm not sure if there's a canonical one though.
I don't remember saying anything about this in my book, since I never talk about principal bundles or associated bundles!
However, your idea is an excellent one: to get a map between two vector bundles associated to the same principal G-bundle, what we need is exactly an equivariant map between two vector spaces on which G acts!
Given a principal G-bundle P over M, the 'associated bundle' trick gives a functor from Rep(G), the category of representations of G on vector spaces and equivariant linear maps between these, to Vect(M), the category of vector bundles over M and vector bundle maps between these.
So you'd need to find a G-equivariant map between Lie(G) and V. I'm not sure if there's a canonical one though.
There's certainly nothing like that: V is just a random representation of G, there's no interesting way to turn Lie algebra elements into elements of this vector space. (You can send them all to zero. :crazy:)
But didn't you want a G-equivariant map from Lie(G) to End(V)?
By the way, I made some related slips myself earlier in this thread, and I just corrected them. For example, I should have said this:
So: I want a map from the vector bundle associated to P via the representation
Ad: G → GL(Lie(G))
to the vector bundle associated to P via the representation
ρ′: G → GL(End(V))
where GL of any vector space is its Lie group of invertible linear transformations.
John Baez said:
There's certainly nothing like that: V is just a random representation of G, there's no interesting way to turn Lie algebra elements into elements of this vector space. (You can send them all to zero. :crazy:)
Right, but I'm trying to find out, out of all possible bundle morphisms Ad(P) -> End(E), which one is "the" morphism I need to use to "transfer" the connection and curvature over. After all, there's only one corresponding notion of curvature on the associated bundle for a given curvature on the principal bundle - not one for every possible equivariant map between the adjoint representation and the target representation!
Right. In my last two posts I've tried to make it clear that there's a god-given obvious vector bundle morphism from Ad(P) to End(E)... without actually telling you what it is, just giving you the means to find it.
John Baez said:
Right. In my last two posts I've tried to make it clear that there's a god-given obvious vector bundle morphism from Ad(P) to End(E)... without actually telling you what it is, just giving you the means to find it.
You'll have to tell me, I haven't been able to figure it out.
In the case where P is the frame bundle of E, it's easy. This is because the group of P is GL(n), the group of invertible n x n matrices acting on E's fibers. The corresponding Lie algebra is the set of ALL n x n matrices. Obviously these are in one-to-one correspondence with linear transformations on E's fibers, which explains why Ad(P) ~ End(E). But if P isn't the frame bundle then there's no such obvious matrix trick to get this to work!
Okay. I said
I want a map from the vector bundle associated to P via the representation
Ad: G → GL(Lie(G))
to the vector bundle associated to P via the representation
ρ′: G → GL(End(V))
We'll get such a thing from a G-equivariant linear map from Lie(G) to End(V), thanks to this fact:
Given a principal G-bundle P over M, the 'associated bundle' trick gives a functor from Rep(G), the category of representations of G on vector spaces and equivariant linear maps between these, to Vect(M), the category of vector bundles over M and vector bundle maps between these.
So what's the G-equivariant map from Lie(G) to End(V)?
Any representation of a Lie group G on a vector space V gives a representation of the Lie algebra Lie(G) on V, and that's the map we want.
John Baez said:
Any representation of a Lie group G on a vector space V gives a representation of the Lie algebra Lie(G) on V, and that's the map we want.
Ok, I think I see what you mean. A representation of a Lie group G on a vector space V is a morphism G -> Aut(V). You're saying there's a way to take this morphism and convert it into a morphism Lie(G) -> End(V). Then you can apply the associated bundle functor for a principal G-bundle P to get a morphism Ad(P) -> End(E) and can then transfer the connection and curvature along that via composition. Hopefully I got all the steps right!
That's right! Some nuances:
I'll remind you that unlike the curvature, a connection is not a section of some bundle. Only a difference of connections is, so the procedure as you described only lets us transfer the difference of two connections from the principal bundle P to the vector bundle E. But it works fine for transferring curvature.
To get everything to work, we need to check that the linear map Lie(G) → End(V) arising from a Lie group homomorphism G → GL(V) is actually G-equivariant.
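Here's a sketch of that check (the notation dρ for the differential of ρ at the identity is mine, not quoted from earlier in the thread):

```latex
% The linear map Lie(G) -> End(V) induced by \rho: G -> GL(V):
d\rho(x) = \left.\tfrac{d}{dt}\right|_{t=0} \rho(\exp(tx)), \qquad x \in \mathrm{Lie}(G).
% Equivariance: using \exp(t \, \mathrm{Ad}_g x) = g \exp(tx) g^{-1},
d\rho(\mathrm{Ad}_g x)
 = \left.\tfrac{d}{dt}\right|_{t=0} \rho\big(g \exp(tx) g^{-1}\big)
 = \rho(g) \, d\rho(x) \, \rho(g)^{-1},
% which is exactly G-equivariance for the conjugation action on End(V).
```

So the map is equivariant essentially because ρ is a homomorphism.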
John Baez said:
I'll remind you that unlike the curvature, a connection is not a section of some bundle. Only a difference of connections is, so the procedure as you described only lets us transfer the difference of two connections from the principal bundle P to the vector bundle E.
I'm a little confused by this. Connections being a difference makes sense for principal G-bundles since they are "made of" torsors. But by the time you get to vector bundles, such as in the associated bundle construction, you already needed to have chosen a trivialization for the principal bundle. Indeed, while one can define multiple different "connection differences" for a principal G-bundle, there's only ever one corresponding notion of "covariant derivative" for the associated bundle. It doesn't make sense to define the covariant derivative as a "connection difference" - how can a differential operator be a difference of something?
John Baez said:
To get everything to work, we need to check the linear map Lie(G) → End(V) arising from a Lie group homomorphism G → GL(V) is actually G-equivariant.
I thought about this and I'm wondering if there's some sort of universal property. Like, the morphism G -> GL(Lie(G)) satisfies a universal property within the category of G-representations. Let's say you have GRep and the category TopVect (or SmoothVect, whatever is needed). There's the forgetful functor U: GRep -> TopVect that sends a G-representation of a topological vector space to its underlying topological vector space. If this functor has an adjoint, a free functor of some sort, this will automatically send the morphism Lie(G) -> End(V) in TopVect to our G-equivariant morphism Ad -> ρ′ in GRep. In a sense we can use this to instantly skip all the mystery around how one can lift the morphism Lie(G) -> End(V), which exists in the category of vector spaces, to the morphism Ad -> ρ′, which exists in the category of G-representations. If only it existed, which of course I'm not sure of. But it would sure make all our lives a lot easier if it did!
John Onstead said:
In a sense we can use this to instantly skip all the mystery around how one can lift the morphism Lie(G) -> End(V), which exists in the category of vector spaces, to the morphism Ad -> ρ′, which exists in the category of G-representations. If only it existed, which of course I'm not sure of. But it would sure make all our lives a lot easier if it did!
I think this will just have to remain one of math's mysteries. I tried my hardest to find the adjunction but all I could get were resources on "Tannakian duality". There were enough resources on categories of representations that for sure an adjunction would have been mentioned if one existed.
I'm a little confused by this. Connections being a difference makes sense for principal G-bundles since they are "made of" torsors. But by the time you get to vector bundles, such as in the associated bundle construction, you already needed to have chosen a trivialization for the principal bundle.
What does that mean? We certainly don't need to choose a trivialization of a principal bundle to define its associated bundles. And we certainly don't need a trivialization of a principal bundle or a vector bundle to define connections on such a bundle.
It's only treating a connection (which is a first-order differential operator) as a section of an endomorphism bundle (which is a zeroth-order differential operator - no derivatives involved!) that requires a kind of non-functorial 'trick'.
You almost said it, but I'll say it here: the set of connections on a vector bundle E → M is a torsor for the vector space of sections of End(E) ⊗ T*M.
Thus, as soon as we pick one connection and call it 'zero', we can identify all connections with sections of End(E) ⊗ T*M. But this choice is arbitrary.
Whenever you have a torsor T for an abelian group A (e.g. a vector space, as we're dealing with now), you can define differences of points in this torsor, which are elements of A. You get an operation
−: T × T → A
as well as the action of A on T:
+: T × A → T
and of course the addition in A:
+: A × A → A
and these obey all the rules you might expect for addition and subtraction.
Similarly, the set of connections on a principal G-bundle P is a torsor for the vector space of sections of Ad(P) ⊗ T*M. Choosing any one connection and calling it 'zero', we can then identify all connections on P with sections of Ad(P) ⊗ T*M.
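Here's a minimal programmatic sketch of those torsor operations, assuming the abelian group is (ℝⁿ, +) modeled by numpy arrays (all the names here are mine, purely for illustration):

```python
# A torsor for an abelian group: you can subtract two points to get a group
# element, and act on a point by a group element, but you cannot add points.
import numpy as np

class TorsorPoint:
    def __init__(self, coords):
        # coords: a hidden identification with R^n, playing the role of a
        # chosen 'zero' - the torsor itself has no preferred origin.
        self._c = np.asarray(coords, dtype=float)

    def __sub__(self, other):
        # difference of two torsor points: an element of the group
        return self._c - other._c

    def translate(self, v):
        # action of the group element v on the torsor point
        return TorsorPoint(self._c + v)

# The rules Baez lists: t - s is the unique v with s.translate(v) = t, etc.
s = TorsorPoint([1.0, 2.0])
t = TorsorPoint([4.0, 6.0])
v = t - s
assert np.allclose(s.translate(v)._c, t._c)   # s + (t - s) = t
```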
John Onstead said:
There's the forgetful functor U: GRep -> TopVect that sends a G-representation of a topological vector space to its underlying topological vector space. If this functor has an adjoint, a free functor of some sort, this will automatically send the morphism Lie(G) -> End(V) in TopVect to our G-equivariant morphism Ad -> ρ′ in GRep.
I think you're on the right track, but you need to tweak this to get it to work. To get a nice adjunction I don't think we should fix a group G ahead of time.
To reduce clutter and complication let's use Vect to mean the category of finite-dimensional vector spaces. Let Rep(G) be the category where an object is a finite-dimensional representation of G on some vector space, and a morphism is a G-equivariant linear map.
I believe we can use the Grothendieck construction to combine all these categories Rep(G) into a single category Rep where an object is a Lie group G together with a finite-dimensional representation ρ: G → GL(V). I think there's a fibration
Rep → LieGrp
such that the fiber over the Lie group G is Rep(G).
If so, there should be a functor
U: Rep → Vect
sending each object ρ: G → GL(V) to its underlying vector space V.
Conjecture. U has a left adjoint
F: Vect → Rep
sending each vector space V to the object 1: GL(V) → GL(V).
That is, this left adjoint seeks the universal Lie group with a representation on V, and this is GL(V), with the identity map as its representation.
This doesn't do everything you want, at least not yet, but it's supposed to be a warmup for understanding how the 'frame bundle' construction of a principal bundle from a vector bundle is something like a left adjoint to the 'associated bundle' construction of vector bundles from a principal G-bundle and representations of G.
(That doesn't parse yet as I've stated it, hence 'something like'.)
I did not attempt to get Lie algebras into the game. I believe there's an adjunction between Lie groups and Lie algebras, too.
John Baez said:
What does that mean? We certainly don't need to choose a trivialization of a principal bundle to define its associated bundles. And we certainly don't need a trivialization of a principal bundle or a vector bundle to define connections on such a bundle.
The associated bundle is the coequalizer of P x G x F -> P x F, where the two morphisms are the right action of G on P, and the left action of G on F. This gives (p, g, f) -> (pg, f) and (p, g, f) -> (p, r(g) f) respectively. In order for this to make sense, p has to be a group element, not a torsor element.
Though this does remind me of another question. The usual way to write the equivalence relation is not (pg, f) ~ (p, r(g) f) but rather (p, f) ~ (pg, r(g^-1) f). However, I'm struggling with two things here- first, how do I get from one to the other? Even if I make the substitution p <-> pg, this only gets me to (p, f) ~ (pg, r(g) f), without the -1. That leads me to the second point- what is the rationale for the "-1"? It seems the relation (p, f) ~ (pg, r(g) f) should do exactly the same thing given every group element has a unique inverse. The goal with the equivalence, after all, is to "collapse" all elements g into a single element since we wish to "divide by G". (p, f) ~ (pg, r(g) f) gets the job done since no matter which g you select, it is all made equivalent to (p, f), which indeed does achieve what we set out to do. So why add in the "-1" to the g if it's completely unnecessary to do so?
The associated bundle is the coequalizer of P x G x F -> P x F, where the two morphisms are the right action of G on P, and the left action of G on F. This gives (p, g, f) -> (pg, f) and (p, g, f) -> (p, r(g) f) respectively. In order for this to make sense, p has to be a group element, not a torsor element.
No, it only needs to be a torsor element! When p is an element of a right G-torsor, and g is an element of G, then pg makes sense: it's the result of acting on p by g, which is another element of our right G-torsor.
So, there's definitely no need to restrict attention to trivial principal bundles, when forming an associated bundle! That restriction would kill off most of the interesting examples. The whole theory would become a pale shadow of its former self.
Btw, the above formula is why we traditionally demand the fibers of a principal bundle be right G-torsors. It's a completely arbitrary choice, but people like this formula.
John Baez said:
Btw, the above formula is why we traditionally demand the fibers of a principal bundle be right G-torsors. It's a completely arbitrary choice, but people like this formula.
Still, I'd like to understand how to get from (pg, f) ~ (p, r(g) f) to (p, f) ~ (pg, r(g^-1) f). The latter form seems more useful since it seems more obvious that it "collapses" all the terms in g to a single element in the quotient.
Let us compute! Assume
(pg, f) ~ (p, r(g) f)
for all p and g. Then
(p, f) = ((pg)g^-1, f) ~ (pg, r(g^-1) f).
Conversely, given
(p, f) ~ (pg, r(g^-1) f)
for all p and g, a similar computation shows
(pg, f) ~ (p, r(g) f).
I might not be very familiar with the "rules of equivalence relations" so I'm having a little bit of a hard time following, but here's my best idea. It looks like you first started with (p, f) ~ (p, f). Then, p = p idG = p (g g^-1), and so (p, f) ~ (p (g g^-1), f). You then rearrange this into (p, f) ~ ((pg) g^-1, f) (I think this still works even if the group is noncommutative, since it's just associativity). If we substitute q = pg and h = g^-1, we can write ((pg) g^-1, f) as (qh, f). But since q and h are elements of P and G respectively, we can use the relation (pg, f) ~ (p, r(g) f) and get (qh, f) ~ (q, r(h) f). We can then reverse the substitution q = pg and h = g^-1 to get what we want: (p, f) ~ (pg, r(g^-1) f).
Yes, and equivalence relations are transitive so we can string together a bunch of ~s just as if they were equations.
The old "replace 1 with gg^-1" is ubiquitous in group theory so I knew we'd need it to derive an expression with a g and a g^-1 in it from one which had neither.
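For what it's worth, here's a quick numerical sanity check of that relation (my own illustration, assuming a trivialized bundle whose fiber is G = SO(2) itself, acting on ℝ² by its standard representation r). The map (p, f) ↦ r(p)f is constant on equivalence classes, so it labels points of the associated bundle:

```python
# Check that (p, f) and (pg, r(g^-1) f) land in the same equivalence class.
import numpy as np

def rot(theta):                        # an element of SO(2)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

p, g = rot(0.7), rot(1.9)              # p in the fiber of P, g in G
f = np.array([2.0, -1.0])              # f in the fiber F

lhs = p @ f                                      # class label of (p, f)
rhs = (p @ g) @ (np.linalg.inv(g) @ f)           # class label of (pg, r(g^-1) f)
assert np.allclose(lhs, rhs)                     # same class, as claimed
```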
John Baez said:
The old "replace 1 with gg−1" is ubiquitous in group theory so I knew we'd need it to derive an expression with a g and a g−1 in it from one which had neither.
Neat trick, I'll keep this in mind for my next group theory exercise!
It's been really interesting learning about the abstract mathematics behind gauge theory so far. But believe it or not, one of the main reasons why I wanted to learn gauge theory is so I can use it in a practical sense as a sort of "template" to turn ideas about physics into actual mathematically rigorous models. So I'd like to turn the conversation more in that direction so I can eventually build up this "template".
To start with, up until now we've gone through a lot of the abstract mathematical procedures for defining a gauge theory. But I'm wondering how things work from the more practical, concrete physicist's perspective - that is, what is the parallel of all the things we've covered so far from the physicist POV? The fact your book does not even discuss principal bundles leads me to believe that physicists don't talk about them much either. If that's the case, then how do physicists approach building up a gauge theory? For instance, say a physicist was handed a symmetry and a representation and asked to use that to construct a gauge theory. What would they do? Would they write down the curvature tensor first, or would they do something with the connection first and then use the covariant derivative formula to compute the tensor (and if so, how would they express/derive the connection)? And as an example, given U(1) symmetry, how would a physicist arrive at deriving both the gauge potential and the curvature tensor F_μν? (Also a minor thought: don't you think the formula for curvature in terms of the covariant derivative is a little weird? You're using the covariant derivative - which comes from the connection - to take the derivative of the very connection 1-form it's defined from!)
The fact your book does not even discuss principal bundles leads me to believe that physicists don't talk about them much either.
Fancy-ass mathematical physicists like me talk about them a lot. But Yang and Mills didn't even know what they were when they invented Yang-Mills theory. (Later Yang had an office near Chern at the Institute for Advanced Study, and they were both shocked that the other one was studying similar ideas.) And when I took courses on quantum field theory the focus was on computing scattering amplitudes, computing how coupling constants run, understanding the Higgs mechanism, and stuff like that. Too much to do, too little time, so no talk of bundles. Remember, over ℝⁿ all bundles are trivializable! So you don't really even need to mention bundles at all - not even vector bundles.
For instance, say a physicist was handed a symmetry and a representation and asked to use that to construct a gauge theory. What would they do?
The first thing in field theory is to choose your fields, and the second is to choose your Lagrangian. If you're doing a gauge theory you have to choose a Lie group G, and choose at least one of your fields to be a Lie(G)-valued 1-form. That's called a gauge field. Then you'd choose some "matter fields", which are typically scalar or spinor fields valued in various representations of G. Then you want to choose a Lagrangian, which is a function of your gauge fields and matter fields. You know how gauge transformations act on your gauge fields and matter fields (you've learned this in class), and you demand that gauge transformations leave the Lagrangian invariant.
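For concreteness, here's a standard textbook example of such a gauge-invariant Lagrangian - scalar electrodynamics, with G = U(1), gauge field A_μ, and one charged scalar matter field φ (this example is mine, not quoted from earlier in the thread):

```latex
\mathcal{L} = -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}
            + (D_\mu \phi)^* (D^\mu \phi) - m^2 \phi^* \phi,
\qquad
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu,
\quad
D_\mu = \partial_\mu - i e A_\mu .
% Gauge transformations act by A_\mu \mapsto A_\mu + \partial_\mu \chi and
% \phi \mapsto e^{i e \chi} \phi, and a short computation shows they leave
% \mathcal{L} invariant.
```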
And as an example, given U(1) symmetry, how would a physicist arrive at deriving both the gauge potential and the curvature tensor F_μν?
Well, realistically speaking, since this case gives Maxwell's equations coupled to matter fields, they wouldn't derive anything: since they'd studied electromagnetism as an undergrad, they'd already know how the electromagnetic field tensor F_μν depends on the electromagnetic vector potential A_μ, so they'd just automatically use that formula:
F_μν = ∂_μ A_ν − ∂_ν A_μ
(This is how physicists would say: a connection on a trivial U(1) bundle is the same as a 1-form A, and its curvature is the same as the 2-form F = dA.)
You could, of course, question them about where this formula comes from and why it's good! Different physicists would give different answers. Fancy ones might start talking about U(1) connections.
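If you want to see that formula do something, here's a small sympy check (my own sketch, with all symbol names mine): compute F_μν = ∂_μ A_ν − ∂_ν A_μ from arbitrary symbolic components and verify it's unchanged by the gauge transformation A_μ ↦ A_μ + ∂_μ χ:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)
A = [sp.Function(f'A{i}')(*X) for i in range(4)]   # components A_mu
chi = sp.Function('chi')(*X)                        # gauge parameter

def field_strength(A):
    # F_{mu nu} = d A in components
    return [[sp.diff(A[n], X[m]) - sp.diff(A[m], X[n]) for n in range(4)]
            for m in range(4)]

F = field_strength(A)
F_gauged = field_strength([A[m] + sp.diff(chi, X[m]) for m in range(4)])

# Gauge invariance: the chi-dependence cancels since mixed partials commute.
assert all(sp.simplify(F[m][n] - F_gauged[m][n]) == 0
           for m in range(4) for n in range(4))
```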
John Baez said:
Remember, over Rn all bundles are trivializable! So you don't really even need to mention bundles at all - not even vector bundles.
I suppose not, but I see bundle theory as a very general organizing principle for mathematical physics, that acts as a good common framework to situate everything on. In that sense, to me at least, the relation between bundle theory and mathematical physics is very analogous to the relation between category theory and math in general.
John Baez said:
The first thing in field theory is to choose your fields, and the second is to choose your Lagrangian.
Ah, so gauge theory is heavily intertwined with Lagrangian mechanics! I suppose that makes sense given how Lagrangian mechanics is the setting where the connection between symmetry and physics becomes most apparent, and is the natural home of Noether's theorem. I do want to circle back to Lagrangians in more detail later, so this overview is very helpful!
John Baez said:
tensor Fμν depends on the electromagnetic vector potential Aμ
What is the electromagnetic vector potential? Isn't the potential supposed to be a 1-form, not a vector field?
John Onstead said:
John Baez said:
Remember, over Rn all bundles are trivializable! So you don't really even need to mention bundles at all - not even vector bundles.
I suppose not, but I see bundle theory as a very general organizing principle for mathematical physics, that acts as a good common framework to situate everything on.
I agree - that's why I wrote a book about this! But you were asking me how physicists think about gauge theory. A typical introduction to gauge fields in a quantum field theory doesn't mention bundles, because they can do the key calculations without them.
John Onstead said:
John Baez said:
The first thing in field theory is to choose your fields, and the second is to choose your Lagrangian.
Ah, so gauge theory is heavily intertwined with Lagrangian mechanics!
I'd put it this way: modern physics derives the equations of motion starting from a Lagrangian, so any sort of field theory in physics these days is obtained starting from a Lagrangian. A gauge field theory is obtained starting from a bunch of fields that transform under gauge transforms, and a gauge-invariant Lagrangian.
My book shows how this works for Yang-Mills theory, Chern-Simons theory, and BF theory.
John Baez said:
the electromagnetic field tensor F_μν depends on the electromagnetic vector potential A_μ
What is the electromagnetic vector potential? Isn't the potential supposed to be a 1-form, not a vector field?
Yeah, a U(1) connection on a trivial bundle is the same as a 1-form, and I'd call that 1-form A. In physics notation A_μ is this 1-form, and A^μ is the corresponding vector field. But this thing got called the "vector potential" in the late 1800s or early 1900s, before anyone knew about the difference between vector fields and 1-forms. And this is the term physicists use.
Hmm, I guess in 4d spacetime (as opposed to 3d space) Wikipedia calls it the 'electromagnetic four-potential'. Its 3 space components are commonly called the 'vector potential'. These articles are good to read to see how physicists think about electromagnetism.
John Baez said:
Yeah, a U(1) connection on a trivial bundle is the same as a 1-form, and I'd call that 1-form A. In physics notation A_μ is this 1-form, and A^μ is the corresponding vector field
How exactly does this work? I know there's something called the "musical isomorphism" that works if your base space is a Riemannian manifold (not sure how well it works for a general smooth manifold). This allows you to transform a vector in TM to a covector in T* M and vice versa. But my problem is that the 1-form A is not just a section of T* M - it's a section of End(E) x T* M. So what do you have to do? Do you have to "forget" the End(E) to get a section of just T* M, and then take the musical isomorphism to get a section of TM?
I'll answer this in the top-down way you like.
Vector bundles over a chosen base form a "2-rig" - a Cauchy complete symmetric monoidal linear category. The tensor product here is the fiberwise tensor product.
Thus, given two maps of vector bundles f: E → E′ and f′: F → F′, we get a map f ⊗ f′: E ⊗ F → E′ ⊗ F′.
In particular, given a map of vector bundles f: F → F′ and a vector bundle E, we get a map of vector bundles 1_E ⊗ f: E ⊗ F → E ⊗ F′.
The metric on a Riemannian or semi-Riemannian manifold turns tangent vectors v into cotangent vectors g(v, −), and this gives a map of vector bundles ♯: TM → T*M.
It follows that for any other vector bundle E, we get a map of vector bundles
(1_E ⊗ ♯): E ⊗ TM → E ⊗ T*M
Voila!
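Concretely, in components this is just index lowering, done fiberwise (a sketch of mine, using the Minkowski metric in one common sign convention):

```python
# Lowering a spacetime index with a metric: the fiberwise picture of the
# musical map TM -> T*M, extended by 1_E tensor (-) to matrix-valued fields.
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])     # Minkowski metric (one convention)

v = np.array([2.0, 1.0, 0.0, 3.0])       # components v^mu of a tangent vector
v_lower = eta @ v                         # v_mu = eta_{mu nu} v^nu

# For an End(E)-valued vector field, (1_E tensor musical map) lowers the
# spacetime index on each matrix-valued component, leaving End(E) alone:
W = np.random.rand(4, 2, 2)               # W^mu, each component a 2x2 matrix
W_lower = np.einsum('mn,nab->mab', eta, W)
```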
John Baez said:
I'll answer this in the top-down way you like.
Thanks! This answer was very helpful!
John Baez said:
It follows that for any other vector bundle E, we get a map of vector bundles
(1E⊗♯):E⊗TM→E⊗T∗M
That's interesting! (By the way, I think you used the sharp symbol, but you meant the flat symbol, right?) So I assume there's also a map (id_E ⊗ sharp): E x T* M -> E x TM, and thus End(E) x T* M -> End(E) x TM. But now I'm confused about what a section of End(E) x TM "means". It makes sense for a 1-form to be valued in End(E), since we get an endomorphism through the process of evaluating the action of the 1-form on a vector field. But it doesn't seem to make much intuitive sense for a vector to have a "value" in End(E), since vectors don't act on anything. In addition, the electromagnetic 4-potential has real numbers as its values, as we would expect of a section of plain TM - no mention of endomorphisms of anything.
In the context of gauge field theory it's not useful to take an End(E)-valued 1-form and turn it into an End(E)-valued vector field: you basically shouldn't do it. It's just a thing you can do when you have a semi-Riemannian metric around: part of the very useful game called 'index juggling'.
As I said, calling the vector potential the 'vector' potential is a leftover from when people didn't care about the difference between vector fields and 1-forms. It's easy not to notice the difference when you think you live in Euclidean space, which seems to have a god-given Riemannian metric.
This myth was damaged by special relativity and destroyed by general relativity, where we learned that the metric tensor is a dynamical field just like the electromagnetic field.
I still think the vector potential is useful since in physics many systems have "potential landscapes" that allow you to more easily visualize and comprehend their dynamics. Generally, the "potential landscape" is a scalar field, and can easily be read off from a vector 4-potential by taking its first component (like how energy can be read off as the first component of the 4-momentum). This means vector potentials still have relevance - that is, unless there's a way to extract a scalar potential directly from a potential 1-form (i.e., extract the electric potential from the electromagnetic 1-form). That would be an interesting computation to see!
The electromagnetic 1-form on Minkowski spacetime, A, is
A = φ dt + A_x dx + A_y dy + A_z dz
(up to sign conventions), where in old-fashioned pre-special-relativity physics φ is called the 'scalar potential' and (A_x, A_y, A_z) is called the 'vector potential'.
Lots of things that look like scalars in pre-relativistic physics turn out to be the time component of a vector or 1-form in relativity. The scalar potential is one; time is another; energy is another.
Ah, that makes sense!
But I'm still confused about something. Why are phi, Ax, Ay, and Az all real scalar fields? Shouldn't they be matrices since the connection 1-form is an End(E) valued differential form?
We're talking electromagnetism on Minkowski spacetime, where End(E)-valued differential forms are just differential forms since E is a 1-dimensional trivialized bundle!
So A is a 1-form on Minkowski spacetime, and F = dA.
Read the stuff in my book on electromagnetism and differential forms - the first chapter.
Nature was very kind to us in making the force most easy to produce - the electromagnetic force - be described by a U(1) gauge theory, on a trivial principal U(1)-bundle, so we could learn about vector calculus and then differential forms before we were pressed into learning about Lie-algebra-valued differential forms and finally connections on nontrivial bundles!
John Baez said:
We're talking electromagnetism on Minkowski spacetime, where End(E)-valued differential forms are just differential forms since E is a 1-dimensional trivialized bundle!
Sure, but what if we were doing quantum electrodynamics instead and therefore wanted our fibers of E to be spinor spaces? Surely the endomorphism bundle of the spinor bundle E is a lot more complicated than just consisting of real numbers.
John Baez said:
Nature was very kind to us in making the force most easy to produce - the electromagnetic force - be described by a U(1) gauge theory, on a trivial principal U(1)-bundle, so we could learn about vector calculus and then differential forms before we were pressed into learning about Lie-algebra-valued differential forms and finally connections on nontrivial bundles!
Maybe it's due to the simplicity of the EM force that we notice it much more than the other forces, since some of the complications of the other forces are the very things that confine their actions and effects to the quantum scale.
John Onstead said:
John Baez said:
We're talking electromagnetism on Minkowski spacetime, where End(E)-valued differential forms are just differential forms since E is a 1-dimensional trivialized bundle!
Sure, but what if we were doing quantum electrodynamics instead and therefore wanted our fibers of E to be spinor spaces? Surely the endomorphism bundle of the spinor bundle E is a lot more complicated than just consisting of real numbers.
When dealing with charged particles of any sort, the fiber V will be a representation of U(1) and thus a representation of the Lie algebra Lie(U(1)). In this case we use that Lie algebra representation
Lie(U(1)) → End(V)
to map our Lie(U(1))-valued 1-form A (which is just a fancy way of talking about a 1-form, since Lie(U(1)) ≅ ℝ) to an End(V)-valued 1-form.
This is the trivial bundle case of the following: given a principal U(1) bundle P → M, and a representation of U(1) on some vector space V, we can form an associated bundle E → M with fiber V. Then, any connection on P gives a connection on E.
And that works for any Lie group G, not just U(1):
Given a principal G-bundle P → M, and a representation of G on some vector space V, we can form an associated bundle E → M with fiber V. Then, any connection on P gives a connection on E.
John Baez said:
When dealing with charged particles of any sort, the fiber V will be a representation of U(1) and thus a representation of the Lie algebra Lie(U(1)). In this case we use that Lie algebra representation
Lie(U(1))→End(V)
to map our Lie(U(1))-valued 1-form A (which is just a fancy way of talking about a 1-form, since Lie(U(1))≅R) to an End(V)-valued 1-form.
Ok, I think I see - this is what we were talking about a day or so ago. If I'm understanding correctly, it seems that you can make V and thus End(V) as complicated as you want - End(V) might even contain the most complex matrices you've ever seen in your life. But no matter how complicated it gets, the image of the morphism Lie(U(1)) -> End(V) will always land in the real numbers and never on any matrix outside of the real numbers. Therefore, the induced connection will, likewise, only take on real values. In a sense, there's a constraint imposed by the homomorphism Lie(U(1)) -> End(V), preventing you from picking anything outside the real numbers in End(V) to be a value of the connection 1-form.
But no matter how complicated it gets, the image of the morphism Lie(U(1)) -> End(V) will always land in the real numbers and never on any matrix outside of the real numbers.
If V describes a particle of some specific charge, that's the kind of morphism Lie(U(1)) -> End(V) that you'll choose to use. There are others, but you won't use those.
Whenever V is a complex vector space, there's a god-given Lie group homomorphism
ρ: U(1) → GL(V)
sending each unit complex number g ∈ U(1) to multiplication by g: V → V.
More generally for each integer n, called the charge, there's a Lie group homomorphism
ρ_n: U(1) → GL(V)
sending g to multiplication by g^n.
When we differentiate this we get a Lie algebra homomorphism
Lie(ρ_n): Lie(U(1)) → Lie(GL(V)) ≅ End(V)
I said Lie(U(1)) ≅ ℝ, but it's better to think of it as the imaginary numbers, since those are what you exponentiate to get elements of U(1).
So, we can say what Lie(ρ_n) does by saying what it sends the number i to, and it sends it to
ni · 1_V ∈ End(V)
So, I'd change what you said slightly by saying our Lie algebra homomorphism always lands in the imaginary numbers (times the identity).
But your main point is exactly right: End(V) may be big and complicated, but our Lie algebra homomorphism lands in a very small piece: the scalar multiples of the identity. By coincidence (?) this is precisely the center of End(V): the guys that commute with all others.
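A one-line sympy check of that computation, for fun (my own sketch): differentiating the charge-n representation e^{iθ} ↦ e^{inθ} at the identity sends i to ni:

```python
import sympy as sp

theta, n = sp.symbols('theta n')
rep = sp.exp(sp.I * n * theta)     # rho_n applied to the curve e^{i theta}
# d/d(theta) at theta = 0 differentiates along the curve with tangent i:
assert sp.diff(rep, theta).subs(theta, 0) == sp.I * n
```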
It's interesting you bring up charge because that's kind of what I wanted to get into next. I know your book covers it, but I'm still having trouble parsing exactly what the "charge" is supposed to be. In many introductions to the subject it seems to be just some constant thrown into the equations because you can, but this is hardly satisfying. I want a reason for there to be a charge in the equations! I guess the first thing I'd want to reduce my confusion on is the relationship between a section of the associated bundle and the notion of charge (or charge density), which itself is derived from the 4-current J. Is J actually just identical to a section of the associated bundle? You seem to be strongly implying this above, so I just wanted to clarify!
No, quite the contrary.
We were talking about a charged field, like the electron field, which is a spinor field ψ on spacetime. The 4-current J is an ordinary 1-form.
There is a formula to compute J from ψ, but they're not the same thing:
J_μ = e ψ̄ γ_μ ψ
where e is the charge of your spinor field, and the γ_μ are 4 matrices called the gamma matrices. This is the quick and dirty physics explanation, not the math explanation.
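In case it's useful, here are the gamma matrices in the Dirac basis, with a numerical check of their defining Clifford relation γ^μ γ^ν + γ^ν γ^μ = 2η^{μν} (my own sketch; conventions vary between books):

```python
import numpy as np

I2 = np.eye(2)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),      # Pauli matrices
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# gamma^0 = diag(I, -I); gamma^i = [[0, sigma_i], [-sigma_i, 0]]
gamma = [np.block([[I2, np.zeros((2, 2))], [np.zeros((2, 2)), -I2]])]
gamma += [np.block([[np.zeros((2, 2)), s], [-s, np.zeros((2, 2))]]) for s in sigma]

eta = np.diag([1.0, -1.0, -1.0, -1.0])    # mostly-minus signature
for m in range(4):
    for n in range(4):
        anti = gamma[m] @ gamma[n] + gamma[n] @ gamma[m]
        assert np.allclose(anti, 2 * eta[m, n] * np.eye(4))
```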
John Baez said:
So, I'd change what you said slightly by saying our Lie algebra homomorphism always lands in the imaginary numbers (times the identity).
I was going over this again and I do think it makes sense to consider the elements of Lie(U(1)) as imaginary numbers, since, as you mentioned, that's what you need to take the exponentials of. But I think the homomorphism Lie(U(1)) -> End(V) should still map into the real numbers in End(V) (maybe this homomorphism is like a multiplication by negative i?), since otherwise we'd expect the electromagnetic 4-potential to be imaginary valued, and this certainly is not the case!
John Baez said:
No, quite the contrary.
Just when I thought I had finished learning all the fields at play in a given gauge theory!
John Baez said:
We were talking about a charged field, like the electron field, which is a spinor field ψ on spacetime.
So a section of the associated bundle for electromagnetism is the spinor field that actually describes the electrons, I guess that makes sense. I'm excited to get into why this means electrons can "see" the EM field! But before that, I wanted to know about the non-quantum case. When we go back to the associated bundle being trivial as you were mentioning (in the classical relativistic case, where we aren't using spinors), then what exactly are the sections of this bundle supposed to "mean" physically? Is it like we are moving from the Dirac equation (spin 1/2) to the Klein Gordon equation (spin 0)? But if so, isn't the Klein Gordon still technically a wave equation- but we left QM so we aren't viewing particles as waves anymore, so what exactly is this supposed to describe then? (I guess this question is more of a physics than math one)
John Onstead said:
John Baez said:
So, I'd change what you said slightly by saying our Lie algebra homomorphism always lands in the imaginary numbers (times the identity).
I was going over this again and I do think it makes sense to consider the elements of Lie(U(1)) as imaginary numbers, since, as you mentioned, that's what you need to take the exponentials of. But I think the homomorphism Lie(U(1)) -> End(V) should still map into the real numbers in End(V) (maybe this homomorphism is like a multiplication by negative i?), since otherwise we'd expect the electromagnetic 4-potential to be imaginary valued, and this certainly is not the case!
I hadn't wanted to get into this earlier, because I was getting tired. But you're right. Physicists indeed use multiplication by −i to map Lie(U(1)) to ℝ, allowing them to treat the 4-potential as real-valued.
Or perhaps they use multiplication by i - only god can discern the difference! :upside_down:
More generally, the Lie algebra Lie(U(N)) consists of skew-adjoint N × N complex matrices, i.e. matrices T with
T* = −T
These are the matrices T such that exp(tT) ∈ U(N) for all t ∈ ℝ. But physicists prefer self-adjoint matrices, so they claim the Lie algebra Lie(U(N)) consists of self-adjoint N × N matrices. To do this they are secretly working, not with T, but with iT or −iT.
This is one of the many things you need to keep in mind when translating between math papers and physics papers.
For example, mathematicians use the Lie bracket
[S, T] = ST − TS
but physicists working with their funny description of Lie(U(N)) get punished for their sin by needing to use the Lie bracket
[S, T] = −i(ST − TS)
Either choice of sign would work, but you need to figure out which one they are using!
Luckily the bracket in Lie(U(1)) vanishes so we are spared thinking about this in electromagnetism!
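A quick numpy check of that bookkeeping (my own sketch): self-adjoint matrices are not closed under ST − TS, but are closed under ∓i(ST − TS):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_selfadjoint(n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return M + M.conj().T            # M + M* is self-adjoint

S, T = random_selfadjoint(3), random_selfadjoint(3)
math_bracket = S @ T - T @ S         # lands in the skew-adjoint matrices
phys_bracket = -1j * math_bracket    # lands back in the self-adjoint ones

assert np.allclose(math_bracket.conj().T, -math_bracket)   # skew-adjoint
assert np.allclose(phys_bracket.conj().T, phys_bracket)    # self-adjoint
```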
@John Onstead wrote:
So a section of the associated bundle for electromagnetism is the spinor field that actually describes the electrons, I guess that makes sense.
:check_mark:
I'm excited to get into why this means electrons can "see" the EM field!
To do this we may need to solve the Dirac equation for a spinor field coupled to the electromagnetic field! Or maybe just write down the equation and ponder it a bit.
But before that, I wanted to know about the non-quantum case. When we go back to the associated bundle being trivial as you were mentioning (in the classical relativistic case, where we aren't using spinors), then what exactly are the sections of this bundle supposed to "mean" physically? Is it like we are moving from the Dirac equation (spin 1/2) to the Klein Gordon equation (spin 0)?
It's a bit strange that you seem to be using "classical relativistic" to mean "not using spinors". There are 3 separate switches you can flip here: classical/quantum, relativistic/nonrelativistic, and spin-1/2 / spin-0. Furthermore, in the classical case you have a choice between working with particles or fields (while in the quantum case these are unified). So there are 4 separate quantum theories to discuss here, and 8 separate classical ones, for a total of 12. Some of these are much more commonly discussed than others, but they all exist. Then there are the relations between these various theories.
But if so, isn't the Klein Gordon still technically a wave equation- but we left QM so we aren't viewing particles as waves anymore, so what exactly is this supposed to describe then? (I guess this question is more of a physics than math one)
This theory describes classical, relativistic, spin-0 fields - not particles. Classical field theory is a perfectly respectable subject, though the most important case is the classical relativistic spin-1 field called the electromagnetic field.
John Baez said:
There are 3 separate switches you can flip here: classical/quantum, relativistic/nonrelativistic, and spin-1/2 / spin-0. Furthermore, in the classical case you have a choice between working with particles or fields (while in the quantum case these are unified). So there are 4 separate quantum theories to discuss here, and 8 separate classical ones, for a total of 12. Some of these are much more commonly discussed than others, but they all exist.
Actually I take it back: you can and should study particles and fields separately even in quantum mechanics. So I believe there are at least 16 theories of charged matter interacting with electromagnetism: particles/fields, classical/quantum, relativistic/nonrelativistic, and spin-1/2 / spin-0.
John Baez said:
So I believe there are at least 16 theories of charged matter interacting with electromagnetism: particles/fields, classical/quantum, relativistic/nonrelativistic, and spin-1/2 / spin-0.
That's quite confusing! I guess I saw spin as a quantum property, so "turning off" quantum to me meant also getting rid of the spin. So I guess my original question pertains to what exactly is being described by a spin-0 classical relativistic field (not particle) that interacts with electromagnetism. Though, that does make me curious about what a spin-1/2 classical (not quantum) relativistic field represents. How can you describe a quantum thing in a non-quantum way? I guess I'm confused about this because you only ever see references to the sections of the associated bundle in the context of quantum field theory when these represent the quantum fields of fermions (or I guess in relativistic quantum mechanics when we are focusing on individual particles), so I'm wondering what significance this concept has outside of QFT and why this isn't discussed more often.
Spin is not inherently quantum, but it's sadly neglected in classical physics. It's just that by the time people get interested in particles or fields with their own intrinsic angular momentum - that is, 'spin' - they are usually interested in treating them quantum-mechanically. You can read lots of papers about classical fields or classical particles with spin, but these are mostly journal articles, not textbooks.
However, a nice exception is that the classical electromagnetic field has spin 1: this is a classical field that people have studied closely enough to spend a lot of time thinking about its intrinsic angular momentum. Even if they don't call it spin, that's what it is.
This is a big subject, so for starters I'll just say that both in classical and quantum mechanics, 'spin' refers to the way an object (say a particle or field) transforms under the rotation group SO(3) - which, when you get to special relativity, is seen as a subgroup of the Lorentz group SO(3,1).
In classical field theory, we are often interested in 'fields' that are functions
φ: M → V
where M is spacetime - let's say ℝ⁴ for concreteness - and V is some vector space that's a representation of SO(3) (if we're doing pre-relativistic physics) or SO(3,1) (if we're doing relativistic physics), or perhaps the universal covers of these groups.
We are, in the end, mostly interested in field theories that have SO(3) or SO(3,1) acting as a symmetry group.
Thus, to make progress in field theory, you need to understand the finite-dimensional representations of these groups. Luckily they are completely classified. And one of the major aspects of the classification is spin.
For example, the finite-dimensional irreducible representations of SO(3) are classified by a natural number s called the spin: the spin-s representation is the unique-up-to-iso irreducible representation of SO(3) of dimension 2s+1.
Every finite-dimensional representation is a direct sum of irreducible ones, so we usually focus on the irreducible representations, at least when we're getting started.
John Baez said:
We are, in the end, mostly interested in field theories that have SO(3) or SO(3,1) acting as a symmetry group.
This is interesting. So when talking about spin, we should focus less on U(1) and more on SO(3) and SO(3, 1). So does this mean that, given the principal SO(3)-bundle, the associated vector bundles corresponding to/built from the irreducible representations of SO(3) are the spinor bundles? If so, then why were we talking about spinor bundles in the context of U(1)- U(1) and SO(3) are two completely different symmetries! But if spinor bundles are associated bundles to SO(3) and not U(1), then how are we supposed to work with spinors in EM? Also, I recall reading something about spin groups like Spin(3). Are spinor bundles the associated bundles to Spin(3), or to SO(3)?
If so, then why were we talking about spinor bundles in the context of U(1)?
You may recall we were talking about U(1) because that's the group that governs electromagnetism. Then for some reason you started talking about spinors. You asked about the electric charge of spinors. In reply I explained how, quite generally, given any vector space V, different charges describe different representations of U(1) on that vector space:
More generally for each integer n, called the charge, there's a Lie group homomorphism
ρ_n: U(1) → GL(V)
sending g to multiplication by g^n.
Now you're asking more about spinors.
Are spinor bundles the associated bundles to Spin(3), or to SO(3)?
They are associated not to SO(3) or SO(3,1) but to their double covers Spin(3) and Spin(3,1). I was talking about SO(3) and SO(3,1) as a warmup. For example, I said that for each integer spin s there is a representation of SO(3) called the spin-s representation. When we go to the double cover Spin(3) we also get spin-s representations for half-integer spins.
However note that if Spin(3) or Spin(3,1) has a representation on some complex vector space V, we can also make V into a representation of U(1), using the above formula!
This is how spinors get to have charge as well as spin.
John Baez said:
They are associated not to SO(3) or SO(3,1) but to their double covers Spin(3) and Spin(3,1). I was talking about SO(3) and SO(3,1) as a warmup. For example, I said that for each integer spin s there is a representation of SO(3) called the spin-s representation. When we go to the double cover Spin(3) we also get spin-s representations for half-integer spins.
However note that if Spin(3) or Spin(3,1) has a representation on some complex vector space V, we can also make V into a representation of U(1), using the above formula!
This is how spinors get to have charge as well as spin.
Thanks, this really puts it all together!
Great!
John Baez said:
However, a nice exception is that the classical electromagnetic field has spin 1: this is a classical field that people have studied closely enough to spend a lot of time thinking about its intrinsic angular momentum. Even if they don't call it spin, that's what it is.
I almost forgot that the EM field itself has a spin! But what's really twisting my mind is that the EM field isn't a section of some associated bundle, it's a connection 1-form on an associated bundle. So is this implying that a connection 1-form can have a spin as well? If so then does that mean somehow End(E) x T* M is an associated bundle (or can be made into one) for the spin 1 representation of the Spin(3) group? And if that's so, then shouldn't the status of the EM field being a spin 1 field depend on what we choose to be our associated U(1) bundle E?
I believe that among the various choices - classical/quantum, nonrelativistic/special-relativistic/general-relativistic, particles/fields - we have decided to talk about classical fields in special relativity. In this case we're studying fields on Minkowski spacetime and hoping that Spin(3,1) acts as symmetries of everything we study.
Since Minkowski spacetime is contractible and every fiber bundle over a contractible manifold is trivializable, everybody working on special relativistic field theory assumes their fiber bundles have been trivialized.
This partially answers your question:
And if that's so, then shouldn't the status of the EM field being a spin 1 field depend on what we choose to be our associated U(1) bundle E?
There is no real choice of principal U(1) bundles over Minkowski spacetime (ℝ⁴): they are all isomorphic. So we should just use the standard one, P = ℝ⁴ × U(1).
However, we still must realize that in the end, all our physical predictions should be gauge-invariant: unchanged by automorphisms of P, which are called gauge transformations. So while we can work using the obvious trivialization of P, for convenience of calculations, we still need to check that our final physical predictions will be unchanged if we apply a gauge transformation (thus changing the trivialization).
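In the U(1) case this check is especially simple - here's my own summary of the standard computation (up to conventions about factors of i and the charge e):

```latex
% A gauge transformation acts on the trivialized connection 1-form by
A \mapsto A + d\chi, \qquad \chi: \mathbb{R}^4 \to \mathbb{R},
% and the curvature - the physically observable field strength - is
% unchanged, since d^2 = 0:
F' = d(A + d\chi) = dA + d^2\chi = dA = F.
```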
At least for me, it takes a while to work through this to the point of showing that the electromagnetic field has spin 1.
One method that works in the case of electromagnetism not yet coupled to matter (the simplest case) goes through several steps.
There may be some quicker procedure, but this rather long process is good for understanding classical electromagnetism and absolutely essential for understanding quantum electromagnetism and QED... so this is what immediately comes to mind.
I guess I'm seeing glimpses of a way to speed this up....
John Baez said:
There is no real choice of principal U(1) bundles over Minkowski spacetime (ℝ⁴): they are all isomorphic. So we should just use the standard one, P = ℝ⁴ × U(1).
I understand, I was talking about associated vector bundles to the principal U(1) bundle.
John Baez said:
There may be some quicker procedure, but this rather long process is good for understanding classical electromagnetism and absolutely essential for understanding quantum electromagnetism and QED... so this is what immediately comes to mind.
This is certainly interesting, but I'm wondering if there's some sort of a "lifting" one can do. For instance, given a complex spin representation Spin(3) -> GL(V), is there any way to "lift" this to a spin representation Spin(3) -> GL(End(V))? For instance with EM, this would allow us to take the spin 1/2 representation (that can also double as a U(1) representation as mentioned above) and use that to make the endomorphism bundle of the associated bundle also a representation, but this time of spin 1. This would automatically make the connection a section of a bundle associated to a spin 1 representation and thus a spin 1 field.
John Onstead said:
John Baez said:
There is no real choice of principal U(1) bundles over Minkowski spacetime (ℝ⁴): they are all isomorphic. So we should just use the standard one, P = ℝ⁴ × U(1).
I understand, I was talking about associated vector bundles to the principal U(1) bundle.
Okay, then this may help:
Say we are given any principal U(1)-bundle P over any manifold M.
Up to isomorphism there's one 1-dimensional vector bundle associated to P for each integer n - called the 'charge' - since every 1-dimensional representation of U(1) is given by the formula I gave involving charge.
Every higher-dimensional vector bundle associated to P is a direct sum of 1-dimensional ones - since every higher-dimensional representation of U(1) is a direct sum of 1-dimensional ones.
These facts about representations of U(1) are a bit hard to prove without a more general study of representation theory.
The next thing to think about in gauge theory will be equations. But before getting into any specific equation, e.g. Yang-Mills, I want to understand what, fundamentally, "IS" an equation, specifically a PDE. Generally, in Set, an equation is given by a pullback square - the square is a commuting diagram (which is where the equality in "equation" comes from) and the pullback object itself is the solution set of the equation. When it comes to a partial differential equation, where is the corresponding pullback being taken (i.e., in which category is the pullback taking place), and what is this pullback being taken of?
Relatedly, I also want to learn more about jet bundles but I haven't found many good resources on them. Would you be able to either point me in the direction of a good source, or give some general basis/overview of what a jet bundle is supposed to be doing and how it deals with differential equations?
The "jet" of a smooth function at a point keeps track of its first derivatives - it looks like this:
There's a bundle over whose fiber at is the set of possible jets at . This bundle is fairly boring. But we can generalize it. For any smooth fiber bundle we can define a new bundle called the nth jet bundle of , whose fiber at keeps track of all possible derivatives of sections of at , up to the nth derivative.
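Here's a concrete example of such a jet, computed with sympy (my own illustration):

```python
# The 2-jet of f(x) = sin(x) e^x at p = 0: the list of derivatives up to
# order 2, evaluated at p.
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x) * sp.exp(x)
p, n = 0, 2

jet = [sp.diff(f, x, k).subs(x, p) for k in range(n + 1)]
print(jet)   # [0, 1, 2] : the 2-jet (f(p), f'(p), f''(p))
```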
The concept of derivatives, and higher derivatives, of a section of a bundle over a manifold requires some thought.
As usual, it's explained on Wikipedia:
though it probably pays to click on their link to the article "Jet (mathematics)".
John Baez said:
The concept of derivatives, and higher derivatives, of a section of a bundle over a manifold requires some thought.
It certainly does! For instance, how can you even define the notion of a derivative of a section of a bundle over a manifold without a connection? The covariant derivative is the only way to keep track of the curving of the manifold when differentiating sections of bundles! And yet it seems the jet bundle works for any arbitrary fiber bundle, so it can do just fine without covariant derivatives.
Also, I don't understand how one can take a coordinate independent partial derivative. By definition, a partial derivative requires specifying a basis (in order to define a notion of an "x direction" or "y direction"), since it is a directional derivative in the direction of one of the basis vectors.
"Derivative" doesn't necessarily mean "directional derivative", e.g. the differential of a function is a 1-form, easily defined without using coordinates, that nonetheless captures all the information in all the partial derivatives for all coordinate systems. is thus a very clean concept of "first derivative" for a function on a manifold.
So we want to take a similar approach for higher derivatives of functions on a manifold, and then extend it to sections of general bundles over a manifold.
John Baez said:
"Derivative" doesn't necessarily mean "directional derivative", e.g. the differential df of a function is a 1-form, easily defined without using coordinates, that nonetheless captures all the information in all the partial derivatives ∂f/∂xi for all coordinate systems. df is thus a very clean concept of "first derivative" for a function on a manifold.
That makes sense; if I am remembering correctly the directional derivative in a direction given by some vector is actually given by acting the differential on the vector. The differential, when specifying functions R^n -> R^m and choosing the standard basis on Euclidean space, gives the Jacobian matrix. When m = 1, the Jacobian matrix has a single row and so reduces to a 1-form. More generally, the differential of any R-valued function on any manifold will be a 1-form.
John Baez said:
So we want to take a similar approach for higher derivatives of functions on a manifold, and then extend it to sections of general bundles over a manifold.
But how exactly does the differential relate to the jet bundle?
The 1-jet of a smooth function f: M → ℝ records its 0th and 1st derivatives, so at each point p it's just the pair consisting of f(p) and df_p. So in this case the 1-jet bundle is just (M × ℝ) ⊕ T*M, where M × ℝ is the trivial line bundle over M.
Things get a lot more interesting and subtle when we go to the 2-jet bundle; see the Wikipedia article. You'll notice they define the n-jet at a point to be an equivalence class of germs (though they don't say the word 'germ', except here).
John Baez said:
Things get a lot more interesting and subtle when we go to the 2-jet bundle; see the Wikipedia article. You'll notice they define the n-jet at a point to be an equivalence class of germs (though they don't say the word 'germ', except here).
This is a really interesting connection now that I've thought about it for a bit, but I want to make sure I'm remembering things from the topos blog discussion correctly. From what I can recall, given a fiber bundle, you can construct its sheaf of sections, which assigns to every open subset of the base space the set of all local sections of the fiber bundle for that open subset. You can then define the concept of a "stalk" of the sheaf to be what the sheaf "looks like" over a single point in the base space. The elements of the stalk are the "germs" of the sections of the bundle that describe the local behavior of a section right at a single point. In a sense, there's a very strong analogy between jet spaces and stalks, and between jets and germs (and maybe this carries into some correspondence between jet bundles and etale spaces?) The main difference seems to be that jets are "truncated" in a certain sense while germs are not. My guess is this is where the equivalence relation comes in, but I'm not sure why the equivalence relation is the way it is.
So my questions are: is my understanding (about germs, stalks, and the connection to jets) correct? Is an infinity prolongated jet space (so an untruncated jet space) "identical" to the stalk of the corresponding sheaf of sections? Why is the equivalence relation on germs the way it is? And lastly, does this work for any smooth bundle- given any smooth bundle p: E -> B, is finding a jet space at a point for its sections really as easy as finding its sheaf of sections, doing the stalk/germ construction, and then doing the equivalence relation?
John Onstead said:
The main difference seems to be that jets are "truncated" in a certain sense while germs are not. My guess is this is where the equivalence relation comes in, but I'm not sure why the equivalence relation is the way it is.
You're definitely on the right track.
We start with the sheaf of smooth sections of some fiber bundle $p \colon E \to B$. Then an n-jet at a point $x \in B$ is an equivalence class of germs at $x$.
What's the equivalence relation?
The germs of two different sections are equivalent iff those sections have the same value, first derivative, second derivative, ... and nth derivative at $x$.
The only technicality shows up in defining precisely what we mean by "the same value, first derivative, second derivative, ... and nth derivative at $x$".
In the passage I pointed you to, Wikipedia did the case where $E$ is the trivial line bundle $B \times \mathbb{R}$. Then a section defined on an open set $U \subseteq B$ is just a fancy way of talking about a smooth real-valued function $f \colon U \to \mathbb{R}$.
There are lots of ways to say when two functions $f, g \colon U \to \mathbb{R}$ have the same value, first derivative, second derivative... and nth derivative at $x$.
One is just to say it:
$$\frac{\partial^k f}{\partial x^{i_1} \cdots \partial x^{i_k}}(x) = \frac{\partial^k g}{\partial x^{i_1} \cdots \partial x^{i_k}}(x)$$
for all $0 \le k \le n$ and all $i_1, \dots, i_k$, and all local coordinate systems in a neighborhood of $x$.
There, didn't seeing all those indices make you feel good? It makes me feel like a real mathematician. :stuck_out_tongue_wink:
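In case a concrete check helps, here's a small sympy version of that definition in two variables (a sketch; the particular functions are arbitrary examples):

```python
import sympy as sp

x, y = sp.symbols('x y')

def same_n_jet(f, g, point, n):
    """Check that f and g and all their partial derivatives up to
    order n agree at the given point (in one coordinate system)."""
    h = f - g
    for order in range(n + 1):
        for i in range(order + 1):   # multi-indices (i, j) with i + j = order
            j = order - i
            if sp.simplify(sp.diff(h, x, i, y, j).subs(point)) != 0:
                return False
    return True

p = {x: 0, y: 0}
f = sp.sin(x) + y       # f = x + y - x**3/6 + ...
g = x + y + x*y         # agrees with f to first order at the origin

print(same_n_jet(f, g, p, 1))   # True: same 1-jet at the origin
print(same_n_jet(f, g, p, 2))   # False: the x*y term differs at order 2
```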
John Baez said:
One is just to say it:
Right, but it's still at least a little disappointing that switching to the sheaf perspective didn't get rid of our need to define some notion of coordinate free derivative. So in a way, while this gives me a new valuable perspective on jets and jet bundles, it doesn't solve my original question of how those coordinate free derivatives are defined in general. It also opens up a new question: given a general smooth bundle p: E -> B, its jet spaces (fibers of its jet bundle) are smooth manifolds, while the stalks of its sheaf of sections are just boring old sets (since a sheaf is defined to be a contravariant functor into the category Set). Taking the proper equivalence relation on the stalk will just return a set, not a manifold/space- so how do we get the space structure back?
Right, but it's still at least a little disappointing that switching to the sheaf perspective didn't get rid of our need to define some notion of coordinate free derivative.
It did. You were just too impatient. I said there were many ways to do this, and I started with the most obvious one.
A slicker way is to notice that the set of functions whose value, first derivative, second derivative,... and nth derivative at $x$ all vanish forms an [[ideal]] in the ring $C^\infty(U)$ of smooth real-valued functions on $U$. Let's call it $I_x^{n+1}$, for some mysterious reason, which I will eventually reveal.
Thus, two functions in $C^\infty(U)$ have the same n-jet if they are equal modulo the ideal $I_x^{n+1}$. In other words, they have the same n-jet if their difference lies in $I_x^{n+1}$.
Thus, the space of n-jets at $x$ is the quotient ring $C^\infty(U)/I_x^{n+1}$.
Even better, we can describe the ideal $I_x^{n+1}$ in a coordinate-free way.
Let $I_x \subseteq C^\infty(U)$ be the set of smooth real-valued functions on $U$ that vanish at $x \in U$.
Then $I_x^{n+1}$ is the ideal generated by functions that are products of $n+1$ functions in $I_x$. That's the mysterious reason I alluded to.
It helps a lot to see an example. Suppose $U = \mathbb{R}$ and $x = 0$.
The function $x$ is in $I_0$, since it vanishes at $0$.
The function $x^2$ is in $I_0^2$. Note that both it and its first derivative vanish at $0$. In fact, any function that's $x^2$ times a smooth function will be in $I_0^2$, by definition of "ideal generated by". Also, any function that's $x^2$ times a smooth function will vanish, along with its first derivative, at $0$.
And so on. The function $x^{n+1}$ is in $I_0^{n+1}$, in fact it generates this ideal, and its first $n$ derivatives all vanish at $0$.
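A quick sympy sanity check of that example (the smooth factor multiplying $x^{n+1}$ is an arbitrary choice):

```python
import sympy as sp

x = sp.symbols('x')
n = 2

# An element of I_0^(n+1): x^(n+1) times some smooth function
h = x**(n + 1) * sp.exp(x) * sp.cos(x)

# Its value and first n derivatives all vanish at 0:
print([sp.diff(h, x, k).subs(x, 0) for k in range(n + 1)])   # [0, 0, 0]

# So h is invisible in the quotient C^oo(U)/I_0^(n+1): its n-jet
# (equivalently, its Taylor polynomial of order n at 0) is zero.
print(sp.series(h, x, 0, n + 1).removeO())                   # 0
```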
Btw, @John Onstead, I had to rewrite my last comment a few times, so please read the final version after this comment appears!
It's really amazing how that works using the connection between derivatives and the multiplicity of zeros!
Yes, we really exploit that old stuff in a fun way here!
@John Onstead wrote:
Taking the proper equivalence relation on the stalk will just return a set, not a manifold/space- so how do we get the space structure back?
The space of n-jets of a fiber bundle at a point $x$ is always a manifold: the value of a section at $x$ is a point in the fiber $E_x$, and the derivatives up to order n are described by a point in a finite-dimensional vector space. A finite-dimensional vector space is always a manifold. So, the space of n-jets at $x$ is the product of two manifolds: the manifold $E_x$ times a finite-dimensional vector space.
In the example I just described, where we're looking at $n$-jets of sections of a vector bundle, we see the space of $n$-jets is actually a finite-dimensional vector space, hence a manifold.
You should be used to how we use forgetful functors in geometry by now. Don't forget all the way down to $\mathsf{Set}$ unless you have some darn good reason to! That's like the "in case of emergency break glass" of forgetful functors. Once you go down that far you have to work to claw your way back up.
John Baez said:
Thus, the space of n-jets at x is the quotient ring
C∞(U)/Ixn+1
Even better, we can describe the ideal Ixn+1 in a coordinate-free way.
Let Ix⊆C∞(U) be the set of smooth real-valued functions on U that vanish at x∈U.
Then Ixn+1 is the ideal generated by functions that are products of n+1 functions in Ix. That's the mysterious reason I alluded to.
That makes everything a lot clearer! I'll try to be more patient next time.
The only flaw I see with this is that it only works when the set of germs of sections of some bundle over a point forms a ring, as in the above example with $C^\infty(U)$. What if the set of germs of sections does not form a ring, such as in the more general case of an arbitrary fiber bundle (where, for instance, it no longer makes sense to be able to "add" or "multiply" two sections together)? Without that ring structure, you can't mod out by an ideal- let alone define the maximal ideal in the first place!
John Baez said:
A finite-dimensional vector space is always manifold. So, the space of n-jets at x is the product of two manifolds: the manifold Ex times a finite-dimensional vector space.
Right, but that wasn't my confusion. The "space" of germs, usually given as a plain set known as the "stalk" of a sheaf, is theoretically the same as the space of infinity-jets, and thus is the product of $E_x$ with an infinite-dimensional vector space, which is not canonically a manifold.
John Baez said:
You should be used to how we use forgetful functors in geometry by now. Don't forget all the way down to Set unless you have some darn good reason to! That's like the "in case of emergency break glass" of forgetful functors. Once you go down that far you have to work to claw your way back up.
I didn't mention forgetful functors, I was talking about sheaves Open(X)^op -> Set. But I think I solved this problem on my own anyways- if you have a nice enough category of smooth manifolds, you can always define the dependent product construction on any smooth bundle. This allows you to define a functor Open(X)^op -> Smooth that sends every open set of X to the smooth manifold of local sections on that open set. This isn't a sheaf at all- it's not even a presheaf since Open(X)^op is not enriched over Smooth! But I still feel you can define the appropriate directed limit construction to define stalks and germs of this functor.
John Onstead said:
The only flaw I see with this is that it only works when the set of germs of sections of some bundle over a point forms a ring, as in the above example with $C^\infty(U)$. What if the set of germs of sections does not form a ring, such as in the more general case of an arbitrary fiber bundle (where, for instance, it no longer makes sense to be able to "add" or "multiply" two sections together)? Without that ring structure, you can't mod out by an ideal- let alone define the maximal ideal in the first place!
Right, then the construction of n-jets has to proceed differently. You'll see the nLab provides two approaches: the concrete approach where you just come out and say a jet is an equivalence class of sections that have the same value, 1st derivative... and nth derivative at a point, and an abstract approach which I don't understand yet. There are probably middle ground approaches, too, but I have no serious objection to the concrete approach.
John Onstead said:
I didn't mention forgetful functors, I was talking about sheaves Open(X)^op -> Set.
The forgetful part is that you were talking about set-valued sheaves when in fact the thing of smooth sections of a vector bundle or fiber bundle is better than a mere set. We can talk about $V$-valued sheaves for any category $V$. For example, the sheaf of sections of a vector bundle is a $\mathsf{Vect}$-valued sheaf. And I believe, as you more or less said - not in quite this language - the sheaf of sections of a smooth fiber bundle should be a sheaf valued in smooth spaces.
Indeed algebraic geometers hardly ever sink to the level of discussing $\mathsf{Set}$-valued sheaves! For them, a sheaf is usually at least $\mathsf{AbGp}$-valued, and most often $\mathsf{CommRing}$-valued or better. For example a [[scheme]] is typically defined as a [[ringed space]] with some properties, and a ringed space is a topological space with a sheaf of commutative rings on it.
It was only Grothendieck who had the guts to start talking about $\mathsf{Set}$-valued sheaves - which led him to invent topos theory, which algebraic geometers have still not embraced.
Anyway, you've figured this out already.
But I think I solved this problem on my own anyways- if you have a nice enough category of smooth manifolds, you can always define the dependent product construction on any smooth bundle. This allows you to define a functor Open(X)^op -> Smooth that sends every open set of X to the smooth manifold of local sections on that open. This isn't a sheaf at all- it's not even a presheaf since Open(X)^op is not enriched over Smooth!
I don't think we need a category $C$ to be $V$-enriched to talk about $V$-valued presheaves or sheaves on $C$. A $V$-valued presheaf is a functor
$$F \colon C^{op} \to V$$
Then, if $C$ has a Grothendieck topology, I believe we define what it means for a $V$-valued presheaf to be a sheaf, by copying the usual definition in the case $V = \mathsf{Set}$. You may want $V$ to have enough limits and/or colimits. Or if you want to live dangerously, maybe you can just demand that the limits and colimits that appear in the definition happen to actually exist. But the category of smooth spaces is complete and cocomplete so there's no problem in this case.
So, I'm going to conjecture that if $p \colon E \to B$ is a bundle of smooth spaces, its sheaf of smooth sections can be regarded as a sheaf valued in the category of smooth spaces.
I'd love some expert to descend from on high, deus ex machina, and tell us whether the sheaf of continuous sections of a bundle of topological spaces is commonly regarded as something like a $\mathsf{Top}$-valued sheaf. (Maybe you'd want to use a 'convenient category' of topological spaces here).
John Baez said:
I don't think we need a category C to be V-enriched to talk about V-valued presheaves or sheaves on C. A V-valued presheaf is a functor
F: C^op → V
Then, if C has a Grothendieck topology, I believe we define what it means for a V-valued presheaf to be a sheaf, by copying the usual definition in the case V = Set. You may want V to have enough limits and/or colimits.
That's really interesting! All I've ever looked at up until now were Set-valued sheaves. Those also seem to be the most intuitive- for instance, in a discussion about motivating sites from a bit ago, it was brought up that sheaves can be thought of as objects in the category "glued together" in a way that respects the Grothendieck topology (even more, a sheaf category could be realized as a sort of completion of the site category). But you don't have this anymore for a V-valued sheaf since only in Set do you have the powerful co-Yoneda lemma that ensures you can do this kind of gluing in the first place!
John Baez said:
Right, then the construction of n-jets has to proceed differently. You'll see the nLab provides two approaches: the concrete approach where you just come out and say a jet is an equivalence class of sections that have the same value, 1st derivative... and nth derivative at a point
I guess I can't avoid it! I'll have to look more into what a "derivative" is supposed to mean in a jet bundle after all...
John Baez said:
and an abstract approach which I don't understand yet
This one seems interesting, but it might be a little above my level. It seems in some topoi there exists an endofunctor on slice categories called the "infinitesimal disk bundle construction". It's the left adjoint to the jet bundle comonad that sends every bundle over X (in the slice category over X) to its jet bundle. It seems to be based on synthetic differential geometry, where the notion of "infinitesimal" is made formal. I was thinking about learning more about synthetic differential geometry at some point in the future!
John Onstead said:
I guess I can't avoid it! I'll have to look more into what a "derivative" is supposed to mean in a jet bundle after all...
Since manifolds look locally like $\mathbb{R}^n$, and differentiation is local, you can define the partial derivatives of a smooth map between manifolds just as if you had a smooth map $\mathbb{R}^m \to \mathbb{R}^n$. A section of a fiber bundle is a special case of a smooth map between manifolds, so we can talk about partial derivatives of a section, and thus define the n-jet of a section.
That's a quick sketch of how it goes.
This one seems interesting, but it might be a little above my level.
Yeah, me too, even though I understand synthetic differential geometry to some extent.
Urs Schreiber and Igor Khavkine were on a quest to understand partial differential equations in physics using jet bundles, and it led them into the abstractions you find now on the nLab:
Personally I find synthetic geometry to be overkill for doing ordinary differential geometry - perhaps because I already know ordinary differential geometry, and you need to know it to read papers about it, or write papers that most geometers can read.
Synthetic differential geometry seems more profitable if you want to "seamlessly generalize the traditional theory to a range of enhanced contexts, such as super-geometry, higher (stacky) differential geometry, or even a combination of both", as Khavkine and Schreiber put it.
John Baez said:
A section of a fiber bundle is a special case of a smooth map between manifolds, so we can talk about partial derivatives of a section, and thus define the n-jet of a section.
I think I've figured it out! So let's say you have a smooth function f: M -> N. Earlier, you mentioned the differential df: TM -> TN. You can define the 1-jet of f at the point p to be the pair (f(p), dfp) where dfp: TpM -> Tf(p)N. But maybe you can keep going with this and define a k-jet of f to be an ordered list (f(p), dfp, d^2 fp, ... d^k fp) where d^k f is the kth differential (the differential applied k times) of f. With this, you can then construct the jet space by making the ordered lists for the jets of f and g equivalent (under the equivalence relation) if f(p) = g(p), dfp = dgp, d^2 fp = d^2 gp, and so on.
This might work; is the idea that $d^2 f$ is a map from $TTM$ to $TTN$?
(If category theorists were in charge they might have called $TTM$ the tangent bundle of "$TM$".)
John Baez said:
This might work; is the idea that d2f is a map from TTM to TTN?
I'm actually not sure! It could be, if by taking the second differential we mean applying the differential endofunctor twice. However, I believe it should specify to the Hessian matrix in the case where M and N are Euclidean spaces. In addition, I've also seen definitions of the second differential as actually being a multilinear map such as in this post. So I'm not exactly sure what to believe!
I know the usual approach to nth derivatives in terms of multilinear maps. But I thought you were trying to avoid it using a clever idea. I think that idea - repeatedly applying the endofunctor $T$ on the category of smooth manifolds - will work. The main reason people don't want to do this is that they don't want to think of the third derivative (for example) of a function as a smooth map $TTTM \to TTTN$ - that feels too 'bulky'. They prefer to think of the third derivative as a section of some vector bundle on $M$. But the interesting thing is that to do this in a coordinate-free way, you are forced to keep track of the 1st, 2nd, and 3rd derivative all together. That's because when you change coordinates, the lower derivatives affect the higher derivatives! (The 0th derivative does not have an effect, but we include it in the n-jet anyway.)
To understand what I mean by this, you have to look at how the Hessian of a function transforms under a change of coordinates on $\mathbb{R}^n$. You'll see the new coordinate-transformed Hessian involves not just the original Hessian of $f$, but also the first partial derivatives of $f$.
John Baez said:
I know the usual approach to nth derivatives in terms of multilinear maps. But I thought you were trying to avoid it using a clever idea. I think that idea - repeatedly applying the endofunctor T on the category of smooth manifolds - will work.
Right, but what is the specific difference? For instance, say I present to you a function f: M -> N. I also then present to you a function d(df): TTM -> TTN. I then present the multilinear map version d^2f: TM x TM -> TN. Lastly, I present the Hessian matrix for Euclidean space. Is there a step by step process I can follow to mechanically and systematically convert one into the other- and if so, what is the calculation? Or are all these ideas incommensurable and there isn't a way to get from one to the other?
I've also found this which seems to be helpful. According to this you can define the jet equivalence if you "require that f(p) = g(p), Tpf = Tpg, and the iterated tangent maps TTf and TTg and so on for up to k copies of T agree in the fibers over p"
First of all, as I tried to explain in my last post, there's no natural way to define that bilinear map you're calling the Hessian TM x TM -> TN.
I will rewrite this to give the example of 2nd derivatives:
John Baez said something like:
They prefer to think of the second derivative as a section of some vector bundle on $M$. But the interesting thing is that to do this in a coordinate-free way, you are forced to keep track of the 1st and 2nd derivative together. That's because when you change coordinates, the lower derivatives affect the higher derivatives! (The 0th derivative does not have an effect, but we include it in the 2-jet anyway.)
To understand what I mean by this, you have to look at how the Hessian of a function transforms under a change of coordinates on $\mathbb{R}^n$. You'll see the new coordinate-transformed Hessian involves not just the original Hessian of $f$, but also the first partial derivatives of $f$.
You really have to do this calculation to understand what I'm talking about - it's one of life's big surprises. You can do the calculation for functions $f \colon \mathbb{R} \to \mathbb{R}$ and see this effect already.
So, there's no diffeomorphism-invariant way to get a bilinear form that describes only the second derivative of a map at each point. This is why people need jets!
One can describe the shocking phenomenon I'm talking about in this way:
1) $n$-jets of functions $f \colon M \to \mathbb{R}$ are sections of a vector bundle called $J^n M$. For each $n$ we have a natural (i.e. diffeomorphism-invariant) bundle map
$$J^{n+1} M \to J^n M$$
where we "throw out the (n+1)st derivative information".
2) On the one hand, we have a natural (i.e. diffeomorphism-invariant) splitting
$$J^1 M \cong \mathbb{R} \oplus T^\ast M$$
where $\mathbb{R}$ is the trivial line bundle over $M$. Sections of $\mathbb{R}$ record the possible values of functions $f \colon M \to \mathbb{R}$, while sections of $T^\ast M$ record the possible first derivatives of functions.
3) On the other hand, there exists no natural splitting
$$J^2 M \cong J^1 M \oplus S^2 T^\ast M$$
We only have a natural epimorphism $J^2 M \to J^1 M$!
All the above is a fancy-ass way of saying this:
Look at how the Hessian of a function transforms under a change of coordinates on $\mathbb{R}^n$. You'll see the new coordinate-transformed Hessian involves not just the original Hessian of $f$, but also the first partial derivatives of $f$.
I'll have to process the above for a little bit longer. Maybe as practice I'll try to do what you suggest and directly derive the Hessian matrix from a map TTR^n -> TTR on my own, and show that in the process I'll get both the Hessian and first partial derivatives too.
It would help to define what the elements of TTM are. Starting with TM, the elements are vectors tangent to the point p, so TTM will be vectors tangent to tangent vectors to p. So elements of TTM consist of the information of two vectors: v, the original tangent vector at point p, and w, a vector tangent to the tangent vector v. So a map from TTM to TTN maps a pair of vectors, v and w, to a vector in TTN; hence, it's a bilinear map. Because the map must include both w and v, this map therefore does what you are saying and includes information about both the second and first derivatives!
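One way to test this reasoning concretely (a sketch for the simplest case $f \colon \mathbb{R} \to \mathbb{R}$, where applying the tangent functor is just symbolic differentiation):

```python
import sympy as sp

x, v, a, b = sp.symbols('x v a b')
f = sp.Function('f')

# The tangent functor applied to f: R -> R, as a map R^2 -> R^2:
#   Tf(x, v) = (f(x), f'(x) v)
Tf = sp.Matrix([f(x), sp.diff(f(x), x) * v])

# Apply T again: the derivative of Tf at (x, v) in the direction (a, b)
J = Tf.jacobian([x, v])
TTf = Tf.col_join(J * sp.Matrix([a, b]))

print(TTf.T)
# (f(x), f'(x)*v, f'(x)*a, f''(x)*a*v + f'(x)*b)
# The second derivative appears, but entangled with the first -- as predicted.
```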
Here's my attempt at some calculations. We know for a function f: R^n -> R, the differential TR^n -> TR is given by the Jacobian matrix, which for a point x = (x_1, x_2, ..., x_n) can have its terms written out in summation form as:
$$df = \sum_i \frac{\partial f}{\partial x_i} \, dx_i$$
If we substitute this expression inside of itself we get
$$d(df) = \sum_j \frac{\partial}{\partial x_j} \left( \sum_i \frac{\partial f}{\partial x_i} \, dx_i \right) dx_j$$
We can pull the sum out of the derivative by the linearity property and get
$$\sum_{i,j} \frac{\partial^2 f}{\partial x_j \, \partial x_i} \, dx_i \, dx_j$$
These are precisely the terms of the Hessian matrix! (I've been practicing some LaTeX, hopefully that made it a little easier to read!)
I also did the exercise suggested, but I'm still too inexperienced with LaTeX to write it all out here, it would just take too much time. But in short, the first thing I did was define a coordinate transformation for the first derivative (the Jacobian matrix), which resulted in needing to use the chain rule. This created a product of two derivative terms. Now that I had the transformation rule for first derivatives, I could plug it back into the first derivative to get the second derivative (like I did above, just with the transformation rule). But since I had a product of derivatives, I had to use the product rule, splitting the expression into two parts added together. When everything settled, these two parts in the final expression were a term describing how the second derivative transformed, and another part which described the first derivatives only. Hopefully I did everything correctly but I got exactly what you told me to expect!
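For anyone who wants the computer to do the same bookkeeping, here's a sympy version of that exercise (the change of coordinates and the test function are arbitrary examples):

```python
import sympy as sp

u, x = sp.symbols('u x')

X = u**3 + u       # old coordinate x as a function of the new coordinate u
f = sp.sin(x)      # a test function, written in the old coordinate

# Left side: just differentiate f(x(u)) twice with respect to u
lhs = sp.diff(f.subs(x, X), u, 2)

# Right side: the transformation law -- the "expected" Jacobian-squared
# term plus the term where the first derivative leaks in
rhs = (sp.diff(X, u)**2 * sp.diff(f, x, 2).subs(x, X)
       + sp.diff(X, u, 2) * sp.diff(f, x).subs(x, X))

print(sp.simplify(lhs - rhs))   # 0
```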
GREAT!
I'm also glad you're dipping your toe into the waters of LaTeX.
Let me see if I can do it. I have some old coordinates $x^i$ and some new coordinates $x'^i$ on some open set of some manifold. Then differentiation w.r.t. these two coordinate systems is related by the usual Jacobian matrix:
$$\frac{\partial}{\partial x'^i} = \frac{\partial x^j}{\partial x'^i} \frac{\partial}{\partial x^j}$$
Here as usual I'm summing over repeated indices, and the idea is that each side of the equation is a differential operator on our $n$-dimensional open set $U$, so if we have any smooth function $f \colon U \to \mathbb{R}$ the above formula is short for this:
$$\frac{\partial f}{\partial x'^i} = \frac{\partial x^j}{\partial x'^i} \frac{\partial f}{\partial x^j}$$
Now let's take a second derivative, say
$$\frac{\partial^2 f}{\partial x'^i \, \partial x'^j}$$
and express it in terms of our original coordinates. I'm putting in the $f$ to keep from getting confused, but I'll sum over repeated indices without writing summation symbols. First I'll tackle the inside expression:
$$\frac{\partial^2 f}{\partial x'^i \, \partial x'^j} = \frac{\partial}{\partial x'^i} \left( \frac{\partial x^k}{\partial x'^j} \frac{\partial f}{\partial x^k} \right)$$
Then I'll use the product rule, aka "Leibniz law" (though analyses of Leibniz's personal journal show he got it wrong the first time):
$$= \frac{\partial^2 x^k}{\partial x'^i \, \partial x'^j} \frac{\partial f}{\partial x^k} + \frac{\partial x^k}{\partial x'^j} \frac{\partial}{\partial x'^i} \frac{\partial f}{\partial x^k}$$
And then I'll convert the second instance of
$$\frac{\partial}{\partial x'^i}$$
into
$$\frac{\partial x^l}{\partial x'^i} \frac{\partial}{\partial x^l}$$
If I haven't screwed up this gives
$$\frac{\partial^2 f}{\partial x'^i \, \partial x'^j} = \frac{\partial^2 x^k}{\partial x'^i \, \partial x'^j} \frac{\partial f}{\partial x^k} + \frac{\partial x^k}{\partial x'^j} \frac{\partial x^l}{\partial x'^i} \frac{\partial^2 f}{\partial x^l \, \partial x^k}$$
People would normally write something more like this:
$$\frac{\partial^2 f}{\partial x'^i \, \partial x'^j} = \frac{\partial^2 x^k}{\partial x'^i \, \partial x'^j} \frac{\partial f}{\partial x^k} + \frac{\partial x^k}{\partial x'^i} \frac{\partial x^l}{\partial x'^j} \frac{\partial^2 f}{\partial x^k \, \partial x^l}$$
The second term here is the "expected" term: the second derivative of $f$ with respect to the new coordinates is just a linear transform of the second derivative of $f$ with respect to the old coordinates. The linear transform is built from two copies of the Jacobian matrix - it's really a tensor product of two copies of the Jacobian matrix.
But the first term involves the first derivative of $f$. It arises when the Jacobian matrix is not constant: it involves the derivative of the Jacobian matrix.
So the moral, as you've already seen yourself, is that the first derivative of $f$ "leaks in" to our formula for how the second derivative transforms when we change coordinates.
We can thank the math gods for one small favor, though: the zeroth derivative of $f$ never shows up here!
This goes on for higher derivatives, and some masochists have worked out the general pattern. I forget the details. However, when we compute the effect of a coordinate transformation on the nth derivative of a function, all the lower derivatives "leak in" - except for the zeroth derivative.
So what we say is that the jet bundle $J^n M$ of a manifold does not naturally split as a sum of vector bundles that separately keep track of the 0th, 1st, ..., nth derivatives. It does split, since all short exact sequences of vector bundles split, but not naturally.
In fact we have a non-natural splitting
$$J^n M \cong \bigoplus_{k=0}^{n} S^k T^\ast M$$
where $S^k T^\ast M$ is the symmetrized kth tensor power of the cotangent bundle.
You may be more familiar with how differential forms are sections of the antisymmetrized tensor powers $\Lambda^k T^\ast M$.
If you haven't thought about symmetrized tensor powers of the cotangent bundle, I'll just say that they arise here from the commuting of mixed partials, e.g.:
$$\frac{\partial^2 f}{\partial x^i \, \partial x^j} = \frac{\partial^2 f}{\partial x^j \, \partial x^i}$$
So if we split off the second derivative information from the jet bundle $J^2 M$, which we can do non-naturally, it's a section of $S^2 T^\ast M$.
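A dimension count gives a cheap consistency check on this splitting (a sketch: $\dim S^k T^\ast M$ is the number of degree-$k$ monomials in $d = \dim M$ variables, and the total should be the number of Taylor coefficients of order $\le n$):

```python
from math import comb

def jet_fiber_dim(d, n):
    """Fiber dimension of J^n M for a d-dimensional M, summed over the
    (non-natural) splitting into symmetrized tensor powers S^k T*M."""
    return sum(comb(k + d - 1, d - 1) for k in range(n + 1))

# An n-jet at a point is a list of Taylor coefficients of order <= n,
# and there are C(n + d, d) of those:
for d in (1, 2, 3, 4):
    for n in (1, 2, 3, 4):
        assert jet_fiber_dim(d, n) == comb(n + d, d)
print("dimensions match")
```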
John Baez said:
People would normally write something more like this:
I think that's essentially what I did and got. Thanks for writing it out!
John Baez said:
You may be more familiar with how differential forms are sections of the antisymmetrized tensor powers ΛkT∗M.
The symmetric power is an interesting new concept and connection!
John Baez said:
So if we split off the second derivative information from the jet bundle J2M, which we can do non-naturally, it's a section of S2T∗M
I see, this makes sense!
Great, I'm glad all this stuff makes sense. And I'm delighted that you bit the bullet and did those partial derivative calculations. They're mildly unpleasant, but I don't know any better way to get a feel for what's going on here.
I still haven't gotten around to answering how this "nth jet bundle" approach compares with your approach. I think they're interconvertible when , and in particular both contain all the lower derivative information, not just the nth derivative information.
By the way, you can quote me without clobbering my beloved LaTeX by going to the right of one of my posts, clicking on the 3 vertical dots that appear, and then clicking on "Quote and reply". Then you can just delete everything except what you want to quote.
Also by the way: when you give your manifold a Riemannian metric, I believe you get a natural way to do the splitting
$$J^n M \cong \bigoplus_{k=0}^{n} S^k T^\ast M$$
But now "natural" means "invariant under all diffeomorphisms of $M$ that preserve the metric".
John Baez said:
By the way, you can quote me without clobbering my beloved LaTeX by going to the right of one of my posts, clicking on the 3 vertical dots that appear, and then clicking on "Quote and reply". Then you can just delete everything except what you want to quote.
I'll do that from now on!
John Baez said:
I still haven't gotten around to answering how this "nth jet bundle" approach compares with your approach. I think they're interconvertible when , and in particular both contain all the lower derivative information, not just the nth derivative information.
I found a post I mentioned above that seems to help in this matter:
John Onstead said:
I've also found this which seems to be helpful. According to this you can define the jet equivalence if you "require that f(p) = g(p), Tpf = Tpg, and the iterated tangent maps TTf and TTg and so on for up to k copies of T agree in the fibers over p"
This post also mentions the "standard way" of defining a jet using paths on a manifold. If $c \colon I \to M$ has $c(0) = p$, then, by figuring out how a map $f \colon M \to N$ acts on each such curve $c$, you can find its jets. Specifically, it seems that given a function $\phi \colon N \to \mathbb{R}$, functions f and g from M to N have equivalent k-jets if and only if $\phi \circ f \circ c$ and $\phi \circ g \circ c$ (where $\phi \circ f \circ c$ is the composition $I \to M \to N \to \mathbb{R}$) are equal at 0 in I for all ϕ and c, for all derivatives of these functions up to the kth derivative. The reason why I find this so fascinating is because $\phi \circ f \circ c$ is just a function $I \to \mathbb{R}$, and of course the set of all functions from I to $\mathbb{R}$ forms a ring $C^\infty(I)$. Recall the problem with the germ approach was that not all maps formed a ring, but here we have a chance to turn any situation involving a generic map into one involving rings.
To do this, we can start with $C^\infty(I)$ and define the stalk at $0$, which is a ring with a maximal ideal m. We can use this to send every function $I \to \mathbb{R}$ to its k-jet by first taking the restriction map into the stalk and then taking the projection map from the equivalence relation, modding out by $m^{k+1}$. Of course, not all maps from I to $\mathbb{R}$ are of the form $\phi \circ f \circ c$, but this can be resolved by identifying the composition map $f \mapsto \phi \circ f \circ c$. In addition, we have to first restrict to the set of all curves $c$ such that $c(0) = p$, but this isn't a problem. By composing everything together, we get a function that sends every map from I to $\mathbb{R}$ of form $\phi \circ f \circ c$ to its k-jet.
However, I'm worried this still doesn't completely solve the problem. I've found a way to send every function of form $\phi \circ f \circ c$ to its k-jet, but this doesn't guarantee I've found a way to send every function f to its k-jet. For instance, the k-jets of $\phi \circ f \circ c$ and $\phi \circ g \circ c$ are equal if the derivatives of these functions at point 0 agree up to k, but this is for a specific choice of ϕ and c. On the other hand, the definition of k-jet for a function f requires that this be true for all choices of ϕ and c. We might have some situation where all maps of form $\phi \circ f \circ c$ and $\phi' \circ g \circ c'$ get sent to the same jet, but this would mean not only that all $\phi \circ f \circ c$ and $\phi \circ g \circ c$ agree (for the same choice of ϕ and c on both sides) up to the kth derivative as we want, but also that all $\phi \circ f \circ c$ and $\phi' \circ g \circ c'$ agree (for a different choice of ϕ and c on both sides) up to the kth derivatives as well, which we may not want. This means it might be possible that f and g still have equivalent jets even if derivatives of $\phi \circ f \circ c$ and $\phi \circ g \circ c$ don't all agree (thus sending the systems $\{\phi \circ f \circ c\}$ and $\{\phi \circ g \circ c\}$ to separate jets of functions), thus implying an inequivalence between jets of functions $\phi \circ f \circ c$ and of functions f.
Sorry for so much text above, I've just been thinking about this germ issue for the past few days and wanted to know your thoughts on it. I also did some more practice with LaTeX so hopefully this looks better!
Thanks, I can read it more easily now.
I like very much this new way of defining germs of smooth maps between manifolds in terms of germs of maps from $I$ to $\mathbb{R}$, which can in turn be defined using ideals as we'd discussed earlier.
This slick abstract definition seems intuitively correct to me. More precisely, it seems to match the coordinate-based definition in terms of partial derivatives. It seems easy to show that the coordinate-based definition implies this slick abstract definition. It seems harder to show the converse, namely the b) $\Rightarrow$ a) part here:
Conjecture. Given two smooth maps $f, g \colon M \to N$ between manifolds and a point $p \in M$, the following are equivalent:
a) $f$ and $g$ and all their partial derivatives up to order $k$ agree at $p$.
b) For every curve $c \colon \mathbb{R} \to M$ with $c(0) = p$ and every function $\phi \colon N \to \mathbb{R}$, the $k$-jets of $\phi \circ f \circ c$ and $\phi \circ g \circ c$ at $0$ are equal.
Proving b) $\Rightarrow$ a) will require some trickery, because a) involves how $f$ and $g$ change "in several directions at once", e.g.
$$\frac{\partial^2 f}{\partial x^1 \, \partial x^2}$$
involves looking at how $f$ changes in both the $x^1$ and $x^2$ directions, while b) involves how $f$ and $g$ change "only along a 1-dimensional curve".
However, by cleverly choosing various curves it should be possible to prove b) $\Rightarrow$ a).
This sort of cleverness was already needed to show that $f \colon \mathbb{R}^n \to \mathbb{R}$ is smooth if $f \circ c$ is smooth for every smooth curve $c \colon \mathbb{R} \to \mathbb{R}^n$. On Math Stackexchange, Jonas Meyer wrote:
This was proved by Jan Boman in the paper "Differentiability of a function and of its compositions with functions of one variable", Math. Scand. 20 (1967), 249-268. Here's an online version, and here's the MathSciNet link. According to the article and review, it had been an unpublished conjecture of Rådström.
The proof looks hard, so this is one of those examples where you have to pay a price for a very elegant definition: you have to really sweat to prove that it has all the consequences you want!
John Onstead said:
This means it might be possible that f and g still have equivalent jets even if derivatives of $\phi \circ f \circ c$ and $\phi \circ g \circ c$ don't all agree (thus sending the systems $\{\phi \circ f \circ c\}$ and $\{\phi \circ g \circ c\}$ to separate jets of functions),
This is true, but I don't think it's 'bad'.
thus implying an inequivalence between jets of functions $\phi \circ f \circ c$ and of functions $f$.
My conjecture tries to state the equivalence correctly, and you're not giving a counterexample to that here.
John Baez said:
The proof looks hard, so this is one of those examples where you have to pay a price for a very elegant definition: you have to really sweat to prove that it has all the consequences you want!
Wikipedia is actually the one that uses this elegant definition of b to define the jet of a function M -> N. That's why I said this was more of a "standard" way to go about it. I don't think it's too hard to show that b implies a, because tangent bundles can actually be defined in terms of equivalence classes of curves on a manifold as well. In fact, this is exactly what the Wikipedia article uses as a basis for eventually coming up with the more general jet definition. As a neat consequence, one can actually define the tangent bundle of M in terms of the 1-jet bundle of functions from R to M!
John Baez said:
My conjecture tries to state the equivalence correctly, and you're not giving a counterexample to that here.
I'm not questioning the conjecture that b implies a, since Wikipedia already makes that clear. I'm asking a separate question based on the assumption b and a are equivalent, about how to use condition b to construct a jet space via the germ definition. Maybe I can be more clear about what specifically I am asking if I condense down what I wrote above. Here's the essence of what I want to know: Given the smooth function that sends every map from $I$ to $\mathbb{R}$ of form $\phi \circ f \circ c$ (where $c(0) = p$) to its k-jet, is the image of this function, in every case, precisely isomorphic to the jet space $J^k_p(M, N)$? That is, does the function send not only $\phi \circ f \circ c$ to its k-jet, but also $f$ to the k-jet of $f$ itself? If not, why, and how specifically would you then get from one to the other?
John Onstead said:
John Baez said:
The proof looks hard, so this is one of those examples where you have to pay a price for a very elegant definition: you have to really sweat to prove that it has all the consequences you want!
Wikipedia is actually the one that uses this elegant definition of b to define the jet of a function M -> N. That's why I said this was more of a "standard" way to go about it.
That's interesting.
I don't think it's too hard to show that b implies a, because tangent bundles can actually be defined in terms of equivalence classes of curves on a manifold as well.
It's not trivial to show b) implies a), for the reason I explained earlier. I agree that it's easy for 1-jets, thanks to the fact that tangent vectors are defined as equivalence classes of curves. But try doing it for 2-jets!
I now think I see how to do it, but I believe it does require a trick.
John Onstead said:
John Baez said:
My conjecture tries to state the equivalence correctly, and you're not giving a counterexample to that here.
I'm not questioning the conjecture that b implies a, since Wikipedia already makes that clear. I'm asking a separate question based on the assumption b and a are equivalent, about how to use condition b to construct a jet space via the germ definition. Maybe I can be more clear about what specifically I am asking if I condense down what I wrote above. Here's the essence of what I want to know: Given the smooth function that sends every map from $I$ to $\mathbb{R}$ of form $\phi \circ f \circ c$ (where $c(0) = p$) to its k-jet, is the image of this function, in every case, precisely isomorphic to the jet space $J^k_p(M, N)$? That is, does the function send not only $\phi \circ f \circ c$ to its k-jet, but also $f$ to the k-jet of $f$ itself? If not, why, and how specifically would you then get from one to the other?
I still don't understand what you are talking about here. You seem to be hoping that the k-jet of $\phi \circ f \circ c$ is the k-jet of $f$. To me that doesn't even parse, because they're elements of different sets. It also seems unlikely that anything constructed from the composite $\phi \circ f \circ c$ would depend only on $f$, as you seem to be hoping here.
Well, recall earlier I was trying to use the equivalence class of germs to define a jet space, but it failed because it requires the stalk to have a ring structure to work; with this new definition of a jet we can express jets in terms of functions I -> R that do have a stalk that forms a ring. So my hope was that we can use this fact to define even a generic jet space in terms of some quotient of germs by a power of the maximal ideal.
Here's my rationale for why I was thinking this approach I outlined might work. A) Two functions $u$ and $v$ from $I$ to $\mathbb{R}$ have the same k-jet at a point $x \in I$ if $u^{(i)}(x) = v^{(i)}(x)$ for $0 \le i \le k$. B) The definition given above states that two functions $f$ and $g$ from $M$ to $N$ have the same k-jet at a point $p$ if $(\phi \circ f \circ c)^{(i)}(0) = (\phi \circ g \circ c)^{(i)}(0)$ for all $\phi$ and $c$. Noting the similarity between A and B, it made me think that I could just "plug in" $\phi \circ f \circ c$ for $u$ (and $\phi \circ g \circ c$ for $v$) and 0 for $x$ into A to exactly reproduce B. Hopefully that makes my intentions clearer!
Yes, your intentions are clear. I believe you need to do some calculations to show a) $\Rightarrow$ b); it's not a case of mere 'plugging in'. I believe showing b) $\Rightarrow$ a) requires a clever trick.
To see what I mean by "some calculations",
1) compute $(\phi \circ f \circ c)'$ in terms of the first derivatives of $\phi$, $f$ and $c$, using the chain rule,
and then
2) compute $(\phi \circ f \circ c)''$ in terms of the first and second derivatives of $\phi$, $f$ and $c$, again using the chain rule.
The second one, which relies on the first, is more fancy.
I've tried doing some work on my problem but I can't seem to get anywhere. So I'll table it for now and we can revisit later?
For now, it is time to move the discussion back towards the case of why I brought up jet bundles in the first place: to describe PDEs. In category theory, an "equation" is defined to be a pullback square. Given an equation f = g, f and g are always some sort of morphism into the same object, the pullback object itself is the "object of solutions" to the equation, and the pullback square exhibits the equality of the equation. Different kinds of equation are just this setup internal to different categories. A PDE usually has the form $D(y) = 0$ where D is a differential operator and y is a section of some bundle. Generally, a differential operator is a morphism $D \colon \Gamma(E) \to \Gamma(F)$; since we can think of 0 as the zero morphism on $\Gamma(F)$, the equation $D(y) = 0$ where y is an element of $\Gamma(E)$ is given by two morphisms into $\Gamma(F)$. Taking the pullback gives us the space of solutions to the PDE exactly as we would expect.
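As a toy illustration of equalizing a differential operator with zero (a sketch; for simplicity the "PDE" here is an ODE, and sympy's solver plays the role of computing the object of solutions):

```python
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

# The operator D(y) = y'' + y, equalized with the zero morphism:
equation = sp.Eq(y(t).diff(t, 2) + y(t), 0)

# The "object of solutions" that the pullback/equalizer picks out:
print(sp.dsolve(equation, y(t)))   # y(t) = C1*sin(t) + C2*cos(t)
```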
With a complete picture of what an "equation" is already perfectly in place in the setting of fiber bundles, why do we need jet bundles? How do jet bundles describe PDEs and why is this "better" than the way I used above?
John Onstead said:
I've tried doing some work on my problem but I can't seem to get anywhere for now. So I'll table it for now and we can revisit later?
I feel like doing a bit of calculating myself. Let's consider the simple case of two functions $f, g \colon \mathbb{R} \to \mathbb{R}$. What is the nth derivative of their composite $g \circ f$?
For the first derivative we have the beautiful-looking and very general chain rule
$$(g \circ f)' = (g' \circ f) \, f'$$
But what about the second derivative? For this I think it pays to unpack the chain rule a bit: in the case of $g \circ f \colon \mathbb{R} \to \mathbb{R}$ it says
$$(g \circ f)'(x) = g'(f(x)) \, f'(x)$$
which is the form most people first learn. This works well for computing the second derivative!
Compute:
$$(g \circ f)''(x) = \frac{d}{dx}\Big( g'(f(x)) \, f'(x) \Big) = g''(f(x)) \, f'(x)^2 + g'(f(x)) \, f''(x)$$
where in the last step we used the chain rule again.
It's a bit mysterious but it shows us a couple of things right away:
The second derivative of $g \circ f$ depends not only on the second derivatives of $f$ and $g$, but on their first derivatives. This points yet again to the usefulness of packaging the first and second derivatives into a single entity, the 2-jet. (The first time we saw this was in computing how the second derivative changes under coordinate transformations.)
The second derivative of $g \circ f$ involves the square of the first derivative of $f$, because to compute it we used the chain rule twice. We can expect that for higher derivatives this pattern will continue (but become more elaborate).
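Both observations are easy to confirm symbolically (a quick sympy check with arbitrary test functions):

```python
import sympy as sp

x = sp.symbols('x')
f = sp.exp(x)     # arbitrary test functions
g = sp.sin(x)

lhs = sp.diff(g.subs(x, f), x, 2)   # (g o f)''

# The formula above: (g o f)'' = (g'' o f) (f')^2 + (g' o f) f''
rhs = (sp.diff(g, x, 2).subs(x, f) * sp.diff(f, x)**2
       + sp.diff(g, x).subs(x, f) * sp.diff(f, x, 2))

print(sp.simplify(lhs - rhs))   # 0
```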
When we use jets, I suppose this 'second derivative chain rule' is given a slick general form
$$J^2(g \circ f) = J^2 g \circ J^2 f$$
I've never really studied this, but if the 2-jet construction is a functor $J^2$, this must be true!
However, it's important to be able to peek under the hood and see how the slick general form reduces to
$$(g \circ f)'' = (g'' \circ f) \, (f')^2 + (g' \circ f) \, f''$$
in the one-variable case, and something similar but more fancy in the multivariable case. Somehow this equation must be contained in how $J^2$ acts on morphisms, and how morphisms between jet bundles are composed!
With a complete picture of what an "equation" is already perfectly in place in the setting of fiber bundles, why do we need jet bundles? How do jet bundles describe PDEs and why is this "better" than the way I used above?
I don't personally use jet bundles in my work. I also don't think of equations as pullbacks, except for equations of the form $f(a) = g(b)$; if I'm doing category theory I often think of equations of the form $f = g$ as defining equalizers.
But anyway: to write an equation in terms of a pullback or equalizer, one needs to have some objects and morphisms between them - and for PDE some of those objects are jet bundles. That's where partial derivatives live.
Tangent and cotangent bundles are good for understanding the geometry of first derivatives: for example, without cotangent bundles it would be hard to understand why 'phase spaces' in physics are 'symplectic manifolds' - a topic important in the Hamiltonian formalism. Similarly, jet bundles are good for understanding the geometry of higher derivatives. People like to use them in studying the calculus of variations, Noether's theorem in field theory, etc.:
I find this variational bicomplex stuff quite interesting: it's a way of formalizing many things physicists do, and also doing new things. But I've never really mastered it, and I seem to limp along okay. (Most physicists don't know what a jet bundle is.)
John Baez said:
I don't personally use jet bundles in my work. I also don't think of equations as pullbacks, except for equations of the form $f(a) = g(b)$; if I'm doing category theory I often think of equations of the form $f = g$ as defining equalizers.
It's a good thing that in many well behaved categories these concepts are interchangeable!
John Baez said:
But anyway: to write an equation in terms of a pullback or equalizer, one needs to have some objects and morphisms between them - and for PDE some of those objects are jet bundles. That's where partial derivatives live.
This makes sense. Under the "abstract" definition we discussed above for the jet comonad, the co-Kleisli category of the jet comonad is the category whose morphisms are differential operators. This means that a differential operator is a morphism from J^k (E) for some bundle E to some other bundle F within the category of bundles over a manifold. Pulling back (or taking the equalizer) using these morphisms is then doing a similar thing to what I was doing with just using spaces of sections. Somehow this must yield the "closed embedded submanifold" of the jet bundle that is supposed to represent the PDE. At least, I'm guessing.
All this sounds about right to me. I've never taken this approach to PDE so I'm just guessing. I've spent years writing papers about relativistic nonlinear PDE - but I was working as an ordinary mathematical physicist, so I never thought about jet bundles, pullbacks, comonads and all that jazz. I was a humble laborer working out in the fields. (Classical and quantum fields.)
Ah, I was hoping to ask in the future about how Lagrangian mechanics works with jet bundles. I'll read the paper on the variational bicomplex you linked to, maybe it will have some answers for how to do that!
With that covered, I think we should move back to gauge theory. In particular, reviewing particular equations like Yang-Mills. I'll review where we left off there and I'll be back later to discuss.
John Onstead said:
Ah, I was hoping to ask in the future about how Lagrangian mechanics works with jet bundles. I'll read the paper on the variational bicomplex you linked to, maybe it will have some answers for how to do that!
I can probably make it up: a Lagrangian is a smooth function on a jet bundle.
But I'd certainly prefer to talk about gauge theory, e.g. Yang-Mills theory.
I was reviewing your book on page 254 to re-learn about the Bianchi identity. It gives a guide for how to naturally extend a connection on a vector bundle to a connection on its endomorphism bundle. However, it does so in a very concrete way and so I wanted to come here to ask about what the abstract, category theory POV way of doing this was. So first, you write about how to extend a connection on E to a connection on E*, which to me implies the existence of a morphism F in the category of vector bundles of the form $T^\ast M \otimes \mathrm{End}(E) \to T^\ast M \otimes \mathrm{End}(E^\ast)$ such that composing a connection, viewed as a section of (and thus a morphism into) $T^\ast M \otimes \mathrm{End}(E)$, with F would give a section of $T^\ast M \otimes \mathrm{End}(E^\ast)$ that precisely corresponds with the transferred connection. Is there a name for this morphism F and how do you derive it?
Secondly, given a connection on E*, you can "tensor" the connections together to get a single connection on the bundle $E \otimes E^\ast$, which is what we want since this is the same as End(E). However, it can't just be a straightforward product of the connections-as-section-morphisms in VectBund, since then the destination of the resulting product morphism would be $(T^\ast M \otimes \mathrm{End}(E)) \otimes (T^\ast M \otimes \mathrm{End}(E^\ast))$ which looks like a hot mess, and not the expected result of $T^\ast M \otimes \mathrm{End}(E \otimes E^\ast)$. How do you resolve this?
I won't fully answer your question, but I feel like setting it into a nice context.
Let $\mathrm{core}(\mathsf{Vect})$ be the groupoid of finite-dimensional vector spaces and invertible linear maps. We can think of an endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ as a "systematic way to turn vector spaces into new vector spaces". Examples include the functor sending a vector space $V$ to $V^\ast$, or to $\mathrm{End}(V)$. I'm using the [[groupoid core]] since both covariant and contravariant functors on a category become functors on its core.
Any endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ acts on the groupoid of vector bundles over a fixed manifold $M$. This requires proof, but the idea is that we can take a vector bundle $E \to M$ and create a new vector bundle $F(E) \to M$ by applying $F$ to each fiber of $E$.
Next, any endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ acts on the groupoid of vector bundles with connection over a fixed manifold $M$.
So, given a vector bundle $E$ with connection, the vector bundle $F(E)$ gets a connection.
To prove this in a slick way, it's good to have a very category-theoretic approach to connections on vector bundles. There are probably several, but what leaps to mind is the one that Urs Schreiber and I described in our paper Higher Gauge Theory. I'm not saying this is the best; it's just the one I know best.
John Onstead said:
.... first, you write about how to extend a connection on E to a connection on E*, which to me implies the existence of a morphism F in the category of vector bundles of the form $T^\ast M \otimes \mathrm{End}(E) \to T^\ast M \otimes \mathrm{End}(E^\ast)$ such that composing a connection, viewed as a section of (and thus a morphism into) $T^\ast M \otimes \mathrm{End}(E)$,
This approach sounds risky. A connection on $E$ is not a section of $T^\ast M \otimes \mathrm{End}(E)$. As I've mentioned a couple times, it's only a difference of two connections on $E$ that's a section of $T^\ast M \otimes \mathrm{End}(E)$. A connection is not a section of some bundle.
You might be able to deal with this somehow, but I don't see how. It seems better to work with some description of what a connection actually is, not a description of one connection relative to another.
I think if we're operating at roughly this level of abstraction, it's perfectly fine to do what I did in my book, which is to give a formula for a connection on $E^\ast$ in terms of one on $E$.
I think if we're going to a higher level of abstraction, it might pay to describe a connection as a kind of functor, which is what Urs and I did.
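For concreteness, here's a minimal sympy sketch of the standard formula in a local trivialization (my own conventions, not necessarily the book's): writing a connection on $E$ along a curve as $d/dt + A$, the dual connection on $E^\ast$ must act on row-vector sections as $\xi \mapsto \xi' - \xi A$, since that's exactly what the Leibniz rule for the pairing forces.

```python
import sympy as sp

t = sp.symbols('t')
n = 2

# In a trivialization, along a curve: a connection on E is d/dt + A(t)
A  = sp.Matrix(n, n, lambda i, j: sp.Function(f'a{i}{j}')(t))
s  = sp.Matrix(n, 1, lambda i, j: sp.Function(f's{i}')(t))    # section of E
xi = sp.Matrix(1, n, lambda i, j: sp.Function(f'x{j}')(t))    # section of E*

nabla_s  = s.diff(t) + A * s     # connection on E
nabla_xi = xi.diff(t) - xi * A   # candidate dual connection on E*

# Defining property: d/dt <xi, s> = <nabla* xi, s> + <xi, nabla s>
lhs = sp.diff((xi * s)[0], t)
rhs = (nabla_xi * s)[0] + (xi * nabla_s)[0]
print(sp.expand(lhs - rhs))      # 0
```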
John Baez said:
I think if we're going to a higher level of abstraction, it might pay to describe a connection as a kind of functor, which is what Urs and I did.
I'm not convinced this construction is any different than the usual definition of a connection in terms of a 1-form (a section of the bundle described). The functorial connection is a functor from the path groupoid of a manifold to the Lie group, viewed as a one-object category, that assigns every path (a morphism in the path groupoid) to a group element (a morphism in the one-object Lie group). Note how I said group element, not torsor element! In other words, the functorial definition of a connection is just as dependent on a choice of trivialization as the 1-form definition. The only difference seems to be that in the 1-form definition, the holonomy map is derived from the connection 1-form, while in the functorial definition it is taken as fundamental instead (with the functor actually being the holonomy map). Please let me know if I'm misunderstanding anything!
John Baez said:
This approach sounds risky. A connection on $E$ is not a section of $T^\ast M \otimes \mathrm{End}(E)$. As I've mentioned a couple times, it's only a difference of two connections on $E$ that's a section of $T^\ast M \otimes \mathrm{End}(E)$. A connection is not a section of some bundle.
I understand this point you are trying to make, I'm just not (philosophically) convinced by it! If connections are some mysterious entity that can't take on a concrete form until I specify a trivialization, then I will much prefer to define a connection to be what you call a "connection difference" since at least that's concrete in nature. There are also good mathematical reasons to believe there's only ever a "connection difference"- unlike the curvature, a connection is affected by gauge transformations. Thus, unlike the curvature where an absolute notion of curvature is possible, it is only ever possible to define the connection difference, which is why I drop the "difference" and just call it a "connection" since that's the only way I could mean it!
John Baez said:
Let $\mathrm{core}(\mathsf{Vect})$ be the groupoid of finite-dimensional vector spaces and invertible linear maps. We can think of an endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ as a "systematic way to turn vector spaces into new vector spaces".
Why the core of Vect and not just Vect itself? Both the examples you provide are endofunctors on "plain" Vect already.
John Baez said:
Next, any endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ acts on the groupoid of vector bundles with connection over a fixed manifold $M$.
So, given a vector bundle $E$ with connection, the vector bundle $F(E)$ gets a connection.
I'm not sure I follow. I can see how a functor on Vect will correspond to one on VectBund, since, as you described, you can just apply the action of the functor fiberwise. This is what I've been assuming I could do anyways, so it's good to have confirmation of that. But I don't see how a functor on Vect will correspond to one on the category of vector bundles with connection. You can assign multiple different connections to the same vector bundle, so the functor won't have a unique choice of where to send your vector bundle with connection. This is also why functors on Set don't correspond to functors on Grp or Mon or any other category of sets with extra structure- there's multiple ways to add on the extra structure, and functors require a unique choice of target for each object.
(Also I'm planning to ask the question about defining jet bundles in terms of germs on Stack Exchange to see if anyone there can help out. Just wanted to let you know, and also ask if you had any tips or things to watch out for as it's one of the first times I will be posting on there)
John Onstead said:
John Baez said:
I think if we're going to a higher level of abstraction, it might pay to describe a connection as a kind of functor, which is what Urs and I did.
I'm not convinced this construction is any different than the usual definition of a connection in terms of a 1-form (a section of the bundle described). The functorial connection is a functor from the path groupoid of a manifold to the Lie group, viewed as a one-object category, that assigns every path (a morphism in the path groupoid) to a group element (a morphism in the one-object Lie group). Note how I said group element, not torsor element! In other words, the functorial definition of a connection is just as dependent on a choice of trivialization as the 1-form definition. The only difference seems to be that in the 1-form definition, the holonomy map is derived from the connection 1-form, while in the functorial definition it is taken as fundamental instead (with the functor actually being the holonomy map). Please let me know if I'm misunderstanding anything!
The group element that the path gets sent to is the difference in torsor elements between the beginning and end of the path. This construction is not dependent on a trivialization because instead of taking a difference of a torsor element and a "god-given" torsor element at the same point it is taking a (path-dependent) difference of torsor elements at different points with none being "god-given".
I'm not convinced this construction is any different than the usual definition of a connection in terms of a 1-form (a section of the bundle described). The functorial connection is a functor from the path groupoid of a manifold to the Lie group, viewed as a one-object category, that assigns every path (a morphism in the path groupoid) to a group element (a morphism in the one-object Lie group). Note how I said group element, not torsor element! In other words, the functorial definition of a connection is just as dependent on a choice of trivialization as the 1-form definition.
You're right, that functorial definition is just as dependent on a choice of trivialization as the 1-form definition. I was talking about a different functorial definition, where we use a (smooth) functor from the (smooth) groupoid of paths to the (smooth) groupoid of fibers of the principal bundle. The latter groupoid has fibers as objects and $G$-torsor maps between fibers as morphisms.
Maybe this latter definition doesn't actually appear in my paper with Urs! It's been so long that I forget.
John Baez said:
You're right, that functorial definition is just as dependent on a choice of trivialization as the 1-form definition. I was talking about a different functorial definition, where we use a (smooth) functor from the (smooth) groupoid of paths to the (smooth) groupoid of fibers of the principal bundle. The latter groupoid has fibers as objects and $G$-torsor maps between fibers as morphisms.
Wait, what? That definition is trivialization-dependent? I thought it was only gauge-dependent! I must be severely confused about something. Does a connection not assign holonomies as group elements to closed paths, independent of trivialization?
If connections are some mysterious entity that can't take on a concrete form until I specify a trivialization, then I will much prefer to define a connection to be what you call a "connection difference" since at least that's concrete in nature.
There's nothing mysterious about a connection; it's just not a section of a bundle. There are several popular definitions of connection. Here are a few:
1) A connection $\nabla$ on a vector bundle $E \to M$ is an operator that takes a vector field $v$ on $M$ and a section $s$ of $E$ and gives a new section $\nabla_v s$ of $E$, obeying a few laws:
$$\nabla_v(s + s') = \nabla_v s + \nabla_v s', \qquad \nabla_v(fs) = v(f)\, s + f \nabla_v s$$
$$\nabla_{v + w}\, s = \nabla_v s + \nabla_w s, \qquad \nabla_{f v}\, s = f \nabla_v s$$
for all smooth real-valued functions $f$ on $M$.
2) We can package the above as a 1-form on $M$ valued in differential operators on the sections of $E$.
3) A connection on a principal $G$-bundle $P \to M$ is a smoothly varying choice of 'horizontal subspace' $H_p \subseteq T_p P$ for each $p \in P$, which is complementary to the vertical subspace (the kernel of the differential of the projection $\pi \colon P \to M$), and invariant under the action of $G$ on $P$.
4) A connection on a principal $G$-bundle $P \to M$ is a smooth functor from the smooth groupoid of lazy paths in $M$ mod thin homotopy to the smooth groupoid of fibers of $P$ and $G$-torsor maps between these, where a path from $x$ to $y$ gets mapped to a $G$-torsor map from $P_x$ to $P_y$.
There are probably others too, but these are the ones I mainly use. We can also talk about connections on general fiber bundles. Here's one definition of those:
5) A connection on a fiber bundle $E \to M$ is a smoothly varying choice of 'horizontal subspace' $H_e \subseteq T_e E$ for each $e \in E$, which is complementary to the vertical subspace (the kernel of the differential of the projection $\pi \colon E \to M$).
and there's also a functorial definition, which I've never seen anyone discuss.
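Definition 4) can be made tangible numerically. Here's a toy sketch (all conventions and the connection itself are made up for the example): a trivial bundle over the plane with $G = \mathrm{SO}(2)$, where the functor sends a loop to its holonomy, computed by integrating the parallel transport equation.

```python
import numpy as np

w = np.array([[0.0, -1.0], [1.0, 0.0]])   # generator of so(2)

def A(p):
    """Connection 1-form A = (x dy - y dx)/2 * w: returns (A_x, A_y) at p."""
    x, y = p
    return (-0.5 * y * w, 0.5 * x * w)

def transport(path, steps=20000):
    """Integrate U' = -(A_x x' + A_y y') U along the path (Euler steps)."""
    U = np.eye(2)
    ts = np.linspace(0.0, 1.0, steps)
    dt = ts[1] - ts[0]
    for t in ts[:-1]:
        p, p_next = path(t), path(t + dt)
        xdot, ydot = (p_next - p) / dt
        Ax, Ay = A(p)
        U = U - (Ax * xdot + Ay * ydot) @ U * dt
    return U

circle = lambda t: np.array([np.cos(2*np.pi*t), np.sin(2*np.pi*t)])
print(transport(circle))
# ~ [[-1, 0], [0, -1]]: the holonomy of the unit circle is rotation by pi,
# since the integral of A around this loop is (enclosed area) * w = pi * w
```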
John Onstead said:
John Baez said:
Let $\mathrm{core}(\mathsf{Vect})$ be the groupoid of finite-dimensional vector spaces and invertible linear maps. We can think of an endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ as a "systematic way to turn vector spaces into new vector spaces".
Why the core of Vect and not just Vect itself? Both the examples you provide are endofunctors on "plain" Vect already.
Actually neither $V \mapsto V^\ast$ nor $V \mapsto \mathrm{End}(V)$ is a functor from $\mathsf{Vect}$ to $\mathsf{Vect}$. That's why I brought in the core. The former is a 'contravariant' functor - a functor from $\mathsf{Vect}^{op}$ to $\mathsf{Vect}$ - while the latter isn't even that.
Ah, I see where I was confused: A map of torsors produces a group element only if it is an endomap. So, yeah, that thing is trivialization-dependent.
John Baez said:
You're right, that functorial definition is just as dependent on a choice of trivialization as the 1-form definition. I was talking about a different functorial definition, where we use a (smooth) functor from the (smooth) groupoid of paths to the (smooth) groupoid of fibers of the principal bundle. The latter groupoid has fibers as objects and $G$-torsor maps between fibers as morphisms.
Ah, that does clarify things, sorry for the confusion. I don't believe I've seen this definition before, and it does make connections on principal bundles slightly less mysterious!
John Baez said:
There's nothing mysterious about a connection; it's just not a section of a bundle.
2) We can package the above as a 1-form on $M$ valued in differential operators on the sections of $E$.
3) A connection on a principal $G$-bundle $P \to M$ is a smoothly varying choice of 'horizontal subspace' $H_p \subseteq T_p P$ for each $p \in P$, which is complementary to the vertical subspace (the kernel of the projection $d\pi \colon T_p P \to T_{\pi(p)} M$), and invariant under the action of $G$ on $P$.
Option 2) seems to be a definition of covariant derivative. But isn't this dependent on the definition of a connection 1-form $A$ as per the definition $\nabla = d + A$? Meanwhile I believe you can express option 3 (or 5) as an "Ehresmann connection", which is a section of the jet bundle $J^1 E \to E$!
John Baez said:
Actually neither $V \mapsto V^*$ nor $V \mapsto \mathrm{End}(V)$ is a functor from $\mathsf{Vect}$ to $\mathsf{Vect}$. That's why I brought in the core. The former is a 'contravariant' functor - a functor from $\mathsf{Vect}^{\mathrm{op}}$ to $\mathsf{Vect}$ - while the latter isn't even that.
$\mathrm{End}(V)$ is of course functorial in two ways! First, we know that $\mathrm{End}(V) \cong V \otimes V^*$, so if the latter is a (contravariant) functor, and obviously tensor products are, then the whole thing is a functor by composition (though I believe when passing to the opposite category you have to turn the tensor product into a "co-tensor product"). In addition, $\mathrm{End}(V)$ is just the internal hom: $\mathrm{End}(V) \cong [V, V]$!
John Onstead said:
John Baez said:
I was talking about a different functorial definition, where we use a (smooth) functor from the (smooth) groupoid of paths to the (smooth) groupoid of fibers of the principal bundle. The latter groupoid has fibers as objects and $G$-torsor maps between them as morphisms.
Ah, that does clarify things, sorry for the confusion. I don't believe I've seen this definition before, and it does make connections on principal bundles slightly less mysterious!
It's not so well-known, so I can easily understand why you didn't read my mind the first time! I think you can find some version of it here:
Abstract. Parallel transport of a connection in a smooth fibre bundle yields a functor from the path groupoid of the base manifold into a category that describes the fibres of the bundle. We characterize functors obtained like this by two notions we introduce: local trivializations and smooth descent data. This provides a way to substitute categories of functors for categories of smooth fibre bundles with connection. We indicate that this concept can be generalized to connections in categorified bundles, and how this generalization improves the understanding of higher dimensional parallel transport.
The technical details may differ a bit from what I said: for example, there are various technically different ways to define a groupoid of smooth paths. But the spirit should be the same.
I wrote:
2) We can package the above as a 1-form on $M$ valued in differential operators on the sections of $E$.
You wrote:
Option 2) seems to be a definition of covariant derivative. But isn't this dependent on the definition of a connection 1-form $A$ as per the definition $\nabla = d + A$?
No, it's not. In 2) we're saying a connection is a 1-form valued in differential operators on sections of $E$, such that for any vector field $v$ on $M$ we get an operator $\nabla_v$ obeying
$$\nabla_{fv + w} = f\,\nabla_v + \nabla_w, \qquad \nabla_v(s + t) = \nabla_v s + \nabla_v t, \qquad \nabla_v(fs) = (vf)\,s + f\,\nabla_v s$$
There's nothing about an $\mathrm{End}(E)$-valued 1-form here! Nor is any trivialization of $E$ required here.
When we trivialize $E$ - and not before we do this! - we can define an operation $d$ on sections of $E$. Then we can define an $\mathrm{End}(E)$-valued 1-form $A$ on $M$ by
$$A(v)\,s = \nabla_v s - d_v s$$
Then we get the formula
$$\nabla_v s = d_v s + A(v)\,s$$
But this splitting of $\nabla$ into two parts, the $d$ part and the $A$ part, is trivialization-dependent. I'd say that $\nabla$ is more fundamental, because it's defined independent of a trivialization of $E$.
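(To see the trivialization-dependence explicitly - a quick sketch in the notation above: if we change trivialization by a smooth $g \colon M \to \mathrm{GL}(n)$, so a section's new components are $\tilde{s} = gs$, then $d_v \tilde{s} = (d_v g)\,s + g\, d_v s$, and demanding that $\nabla$ stay the same operator forces
$$\tilde{A}(v) = g\, A(v)\, g^{-1} - (d_v g)\, g^{-1}.$$
So $d$ and $A$ each shift, while their sum $\nabla$ doesn't.)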
John Onstead said:
$\mathrm{End}(V)$ is of course functorial in two ways! First, we know that $\mathrm{End}(V) \cong V \otimes V^*$, so if the latter is a (contravariant) functor, and obviously tensor products are, then the whole thing is a functor by composition (though I believe when passing to the opposite category you have to turn the tensor product into a "co-tensor product"). In addition, $\mathrm{End}(V)$ is just the internal hom: $\mathrm{End}(V) \cong [V, V]$!
Both these attempts suffer from the problem that
$$\hom(V, W)$$
is contravariant in $V$ and covariant in $W$. So while there's a perfectly fine functor
$$\hom \colon \mathsf{Vect}^{\mathrm{op}} \times \mathsf{Vect} \to \mathsf{Vect}$$
mapping $(V, W)$ to $\hom(V, W)$, it turns out that there is no functor
$$\mathrm{End} \colon \mathsf{Vect} \to \mathsf{Vect}$$
mapping $V$ to $\mathrm{End}(V)$. That is, having defined this would-be functor on objects by
$$\mathrm{End}(V) = \hom(V, V),$$
there's no way to define it on morphisms to get a functor from $\mathsf{Vect}$ to itself.
Of course this requires proof, and I haven't provided a proof, so there's an implicit puzzle here: how do we prove it?
There is, however, a functor
$$\mathrm{End} \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$$
sending $V$ to $\mathrm{End}(V)$. That is, having defined this would-be functor on objects by
$$\mathrm{End}(V) = \hom(V, V),$$
we can define it on isomorphisms in $\mathsf{Vect}$ and get a functor from $\mathrm{core}(\mathsf{Vect})$ to itself.
There's another implicit puzzle here: how do we define $\mathrm{End}$ on isomorphisms? I'll just say it doesn't depend on any special features of $\mathsf{Vect}$; it works for any symmetric monoidal closed category... and I'm sticking in 'symmetric' just to be super-careful; I don't think it's needed.
The point of all this baloney, lest we lose sight of it, is that parallel transport along a path always gives an isomorphism from one fiber to another. So, we expect any construction we can do to fibers, which is functorial with respect to isomorphisms, can serve as a way to build a new bundle from an old one, and also turn a connection on the old one into a connection on the new one.
(In reality we need not just functoriality but also 'smoothness', since a connection can be seen as a smooth functor from the path groupoid to the groupoid with fibers as objects and suitable smooth maps between fibers as morphisms.)
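(For instance, at the level of covariant derivatives, the connections this procedure hands you come out to the usual textbook formulas - modulo my sign conventions - for the dual and endomorphism bundles:
$$(\nabla^{*}_v \lambda)(s) = v(\lambda(s)) - \lambda(\nabla_v s), \qquad (\nabla^{\mathrm{End}}_v T)(s) = \nabla_v(Ts) - T(\nabla_v s),$$
chosen exactly so that the pairing and composition are parallel.)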
John Baez said:
The point of all this baloney, lest we lose sight of it, is that parallel transport along a path always gives an isomorphism from one fiber to another. So, we expect any construction we can do to fibers, which is functorial with respect to isomorphisms, can serve as a way to build a new bundle from an old one, and also turn a connection on the old one into a connection on the new one.
That's a good summary!
John Baez said:
So while there's a perfectly fine functor
$$\hom \colon \mathsf{Vect}^{\mathrm{op}} \times \mathsf{Vect} \to \mathsf{Vect}$$
mapping $(V, W)$ to $\hom(V, W)$
Ah, that must have been what I was thinking of, my bad!
John Baez said:
It's not so well-known, so I can easily understand why you didn't read my mind the first time! I think you can find some version of it here:
- Urs Schreiber and Konrad Waldorf, Parallel transport and functors.
Thanks, I'll check this out!
John Baez said:
But this splitting of $\nabla$ into two parts, the $d$ part and the $A$ part, is trivialization-dependent. I'd say that $\nabla$ is more fundamental, because it's defined independent of a trivialization of $E$.
I think I understand! I'm just confused about one other thing - the above statement "When we trivialize $E$ - and not before we do this! - we can define an operation $d$ on sections of $E$". But the operation $d$ is just the exterior derivative - it doesn't need a trivialization, or really for that matter a bundle at all, to work. Maybe you meant to say "When we trivialize $E$ - and not before we do this! - we can define an operation $A$ on sections of $E$" instead?
You can define $df$ for a smooth function $f$ on a manifold without any extra structure. But now we are talking about defining $ds$ for a section $s$ of a vector bundle $E \to M$. To do this, we choose a trivialization of $E$, i.e. an isomorphism between $E$ and $M \times \mathbb{R}^n$. After doing this, and only after doing this, we can identify a section $s$ of $E$ with an $n$-tuple of smooth functions $(s_1, \dots, s_n)$. This lets us define $d$ of a section of $E$ by
$$ds = (ds_1, \dots, ds_n)$$
But if we changed the trivialization, we'd get a different operator $d$ for sections of $E$!
So: given a vector field $v$, the covariant derivative $\nabla_v s$ of a section $s$ of $E$ can be defined using only a connection on $E$.
But when we write it this way:
$$\nabla_v s = d_v s + A(v)\,s$$
we need to have trivialized $E$ to know what the $\mathbb{R}^n$-valued 1-form $ds$ actually is! And this, in turn, determines what the $\mathrm{End}(E)$-valued 1-form $A$ is.
If we change the trivialization, $d$ and $A$ change, but $\nabla$ itself does not.
By the way, if you don't mind covariant exterior derivatives we can write the above equation as
$$\nabla s = ds + As$$
or even as
$$\nabla s = (d + A)\,s$$
or maybe more tersely as
$$\nabla = d + A$$
but these variants don't change the essential point: the way of splitting the left side into two parts depends on a choice of trivialization.
By the way, in my book on pages 226-227, I write the operator $d$ for sections of a trivialized vector bundle as $\nabla^0$, and I call it the standard flat connection associated to the trivialization. That is probably safer than overloading the meaning of $d$. But lots of people write it as $d$.
That's very interesting! It seems that $d$ acts as a sort of partial derivative, and so requires local coordinates to make sense, which can only be specified after the trivialization. At least I hope I'm understanding that right!
I wanted to return to the issue that started this discussion in the first place, which was defining how to extend a connection on $E$ to one on $E^*$ and $\mathrm{End}(E)$:
John Baez said:
Next, any endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ acts on the groupoid of vector bundles with connection over a fixed manifold $M$.
So, given a vector bundle $E$ with connection, the vector bundle $F(E)$ gets a connection.
This implies that given an endofunctor on $\mathrm{core}(\mathsf{Vect})$ - of which the dual space and endomorphism space functors are examples - there's somehow also then an endofunctor on the category of vector bundles with connection that sends a vector bundle equipped with a connection to the dual bundle/endomorphism bundle equipped with the "correct" connection. But as I pointed out above, a functor on the category of vector bundles won't uniquely extend to a functor on the category of bundles with connection, since a connection is extra structure that can be specified in multiple ways for the same object. So how would this problem be resolved?
John Baez said:
The point of all this baloney, lest we lose sight of it, is that parallel transport along a path always gives an isomorphism from one fiber to another. So, we expect any construction we can do to fibers, which is functorial with respect to isomorphisms, can serve as a way to build a new bundle from an old one, and also turn a connection on the old one into a connection on the new one.
:point_up:
To be more precise: you can see a connection as a smooth functor from points and smooth paths into $\mathrm{core}(\mathsf{Vect})$, giving for each point the fiber and for each path the holonomy, and you can postcompose by your other functor to get a new connection on your new bundle. It's not that every functor on the category of vector bundles extends to a functor on the category of bundles with connection, it's that the way you're deriving these functors from functors on $\mathrm{core}(\mathsf{Vect})$ automatically also gives you a blessed extension.
John Onstead said:
That's very interesting! It seems that $d$ acts as a sort of partial derivative, and so requires local coordinates to make sense, which can only be specified after the trivialization. At least I hope I'm understanding that right!
I don't think we need coordinates, though I get what you mean, and in my exposition I used coordinates because it was easy.
If I were trying to work coordinate-free, I'd say that $df$ is a concept that makes sense for a map $f \colon M \to V$ into a vector space, but to treat a section of a vector bundle $E$ over $M$ as a map $M \to V$ for some vector space $V$, we need to choose a trivialization $E \cong M \times V$, identifying each fiber with $V$.
Then from a section $s$ we can extract a function $f \colon M \to V$, and we can take $d$ of that getting $df \colon TM \to TV$, but then we can use the canonical isomorphisms $T_v V \cong V$ to extract from this a map $TM \to V$, which is a $V$-valued 1-form. We may then sloppily call that $V$-valued 1-form "$ds$".
In my exposition I chose $V = \mathbb{R}^n$ and used coordinates just to make the explanation a lot quicker!
TL;DR: we can take $d$ of a section of a trivialized vector bundle and get a vector-valued 1-form. But the answer depends on the trivialization.
Another way to put this is that every trivialized vector bundle has a god-given flat connection called $\nabla^0$.
A difference of connections is an $\mathrm{End}(E)$-valued 1-form. Thus, given any other connection $\nabla$ on our trivialized vector bundle $E$, the difference
$$A = \nabla - \nabla^0$$
is an $\mathrm{End}(E)$-valued 1-form, and we can write
$$\nabla = \nabla^0 + A.$$
But $\nabla^0$ and thus $A$ depend on the choice of trivialization. And of course most vector bundles don't have any trivialization.
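(A tiny example to make this concrete: on the trivial line bundle over $\mathbb{R}$, $\nabla^0 s = s'(x)\,dx$, and any other connection looks like
$$\nabla s = \left(s'(x) + a(x)\,s(x)\right) dx$$
for some smooth function $a$, so here $A = a(x)\,dx$. A parallel section satisfies $s' + as = 0$, giving parallel transport from $0$ to $t$ by multiplication by $e^{-\int_0^t a}$.)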
John Onstead said:
John Baez said:
Next, any endofunctor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ acts on the groupoid of vector bundles with connection over a fixed manifold $M$.
So, given a vector bundle $E$ with connection, the vector bundle $F(E)$ gets a connection.
This implies that given an endofunctor on $\mathrm{core}(\mathsf{Vect})$ - of which the dual space and endomorphism space functors are examples - there's somehow also then an endofunctor on the category of vector bundles with connection that sends a vector bundle equipped with a connection to the dual bundle/endomorphism bundle equipped with the "correct" connection. But as I pointed out above, a functor on the category of vector bundles won't uniquely extend to a functor on the category of bundles with connection, since a connection is extra structure that can be specified in multiple ways for the same object. So how would this problem be resolved?
We don't try to do such an extension; as you point out, it's too nonunique.
We should probably start by talking about the important functor
$$[\mathrm{core}(\mathsf{Vect}), \mathrm{core}(\mathsf{Vect})] \to [\mathrm{core}(\mathsf{VectBund}(M)), \mathrm{core}(\mathsf{VectBund}(M))]$$
Here $\mathsf{Vect}$ is my sloppy notation for the category of finite-dimensional vector spaces, and $\mathsf{VectBund}(M)$ is the category of (finite-dimensional) vector bundles over a fixed manifold $M$.
Here's the idea behind this functor. Any decent procedure for getting new (finite-dimensional) vector spaces from old ones gives a procedure for getting new vector bundles from old ones. Just apply the procedure to each fiber!
After we understand this functor, we can move on to the important functor
$$[\mathrm{core}(\mathsf{Vect}), \mathrm{core}(\mathsf{Vect})] \to [\mathrm{core}(\mathsf{VectConn}(M)), \mathrm{core}(\mathsf{VectConn}(M))]$$
Here $\mathsf{VectConn}(M)$ is the category of vector bundles with connection over a fixed manifold $M$.
Any decent procedure for getting new (finite-dimensional) vector spaces from old ones gives a procedure for getting new vector bundles with connection from old ones!
Btw, congrats on the LaTeX! The easiest way to learn LaTeX is to copy little bits of other people's LaTeX and pay attention to how it works.
James Deikun said:
To be more precise: you can see a connection as a smooth functor from points and smooth paths into $\mathrm{core}(\mathsf{Vect})$, giving for each point the fiber and for each path the holonomy, and you can postcompose by your other functor to get a new connection on your new bundle. It's not that every functor on the category of vector bundles extends to a functor on the category of bundles with connection, it's that the way you're deriving these functors from functors on $\mathrm{core}(\mathsf{Vect})$ automatically also gives you a blessed extension.
Thanks, that makes sense!
Earlier, I was given the definition for a principal connection in this way, so I'd been wondering how to define a vector bundle connection functorially- this helps clear that up too!
John Baez said:
If I were trying to work coordinate-free, I'd say that $df$ is a concept that makes sense for a map $f \colon M \to V$ into a vector space, but to treat a section of a vector bundle $E$ over $M$ as a map $M \to V$ for some vector space $V$, we need to choose a trivialization $E \cong M \times V$, identifying each fiber with $V$.
Then from a section $s$ we can extract a function $f \colon M \to V$, and we can take $d$ of that getting $df \colon TM \to TV$, but then we can use the canonical isomorphisms $T_v V \cong V$ to extract from this a map $TM \to V$, which is a $V$-valued 1-form. We may then sloppily call that $V$-valued 1-form "$ds$".
From this it seems you are implying that $d$ is just the differential we were talking about earlier, where for a function $f \colon M \to N$ we get $df \colon TM \to TN$. But if this is the case, then why can't you just define $ds$ for a section of a vector bundle given by $s \colon M \to E$ to be the differential $ds \colon TM \to TE$? In other words, why go through all the business of converting the section back into a map into $V$ before taking the differential when you could just directly take the differential instead?
Edit: Thought about this for a bit longer- maybe you can take the differential as I described, but it won't be a 1-form, and we need it to be?
Also, the above seems to strongly imply that $d$, and thus the formula $\nabla = d + A$, only works for a trivial bundle, and doesn't work for the more general locally trivial bundle. If so, then how do things work in the latter situation? To me the formula "means" that the covariant derivative is the usual derivative plus a "correction factor", given by the connection 1-form $A$, that takes into account the curvature of the space. Is this interpretation of a covariant derivative not always possible then?
It's always possible locally, but sometimes the "usual" derivative isn't globally definable, and the "correction" gets the job of bridging between the different local derivatives.
James Deikun said:
To be more precise: you can see a connection as a smooth functor from points and smooth paths into $\mathrm{core}(\mathsf{Vect})$, giving for each point the fiber and for each path the holonomy, and you can postcompose by your other functor to get a new connection on your new bundle. It's not that every functor on the category of vector bundles extends to a functor on the category of bundles with connection, it's that the way you're deriving these functors from functors on $\mathrm{core}(\mathsf{Vect})$ automatically also gives you a blessed extension.
Thanks for explaining the idea. We've got this groupoid of paths in $M$, say $\mathcal{P}(M)$, and a vector-bundle-with-connection is a functor
$$T \colon \mathcal{P}(M) \to \mathsf{Vect}$$
obeying a certain smoothness condition. Since $\mathcal{P}(M)$ is a groupoid, this is the same as a smooth functor
$$T \colon \mathcal{P}(M) \to \mathrm{core}(\mathsf{Vect})$$
Then, given any smooth functor
$$F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$$
we can form
$$F \circ T \colon \mathcal{P}(M) \to \mathrm{core}(\mathsf{Vect})$$
which is a new vector-bundle-with-connection.
So, any smooth functor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ gives a functorial way to get new vector-bundles-with-connection from old ones!
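(Concretely, at the level of holonomies: if a loop $\gamma$ at $x$ has holonomy $\mathrm{hol}_T(\gamma) \in \mathrm{GL}(E_x)$, the new connection's holonomy is just $F$ of it. For the dual-space functor, which on isomorphisms sends $f$ to $(f^{-1})^*$, this gives
$$\mathrm{hol}_{F \circ T}(\gamma) = \left(\mathrm{hol}_T(\gamma)^{-1}\right)^{*}.)$$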
If we want to get really serious we should talk about the smoothness conditions someday. Then we can also do this example:
We can think of $M$ as giving a smooth groupoid with $M$ as its space of objects and only identity morphisms, say $\mathrm{Disc}(M)$. A vector bundle on $M$ should be the same as a smooth functor
$$T \colon \mathrm{Disc}(M) \to \mathrm{core}(\mathsf{Vect})$$
Thus, we can compose with $F$ to get a new vector bundle
$$F \circ T \colon \mathrm{Disc}(M) \to \mathrm{core}(\mathsf{Vect})$$
So, any smooth functor $F \colon \mathrm{core}(\mathsf{Vect}) \to \mathrm{core}(\mathsf{Vect})$ gives a functorial way to get new vector bundles from old ones!
Thanks James Deikun and John Baez for your help! I'm still updating my notes on all this, it's a lot to take in. But while I do that, I have a question that came to mind when we were discussing second derivatives above. Given an exterior covariant derivative $d_\nabla$, we write $d_\nabla d_\nabla = F$ for a curvature form $F$. As we covered, the curvature precisely measures how much the exterior covariant derivative fails to square to zero the way a "normal" exterior derivative would. But this reminds me a lot of something I learned in first year calculus class - that the second derivative is a measure of the "curvature" of a function. By any chance, are these two concepts related in any way? Or is it just a weird coincidence that the second covariant exterior derivative and the second usual derivative both yield some notion of "curvature"?
It's the same business: first derivatives describe the linear - i.e., flat - approximation to something like a function, or section of a bundle, or submanifold of a manifold, or Riemannian manifold, while second derivatives describe the 'deviation from flatness': the failure of the linear approximation to be exactly correct. Curvature is the failure of something to be exactly flat.
There are also third derivatives, which measure even subtler kinds of bending, and so on.
A good way to bring this stuff down to earth is to look at a 2-dimensional manifold with a Riemannian metric. If its Riemann curvature tensor is zero, it's flat: it looks locally exactly like a plane. If its curvature tensor is nonzero, we get a better local approximation to its shape using an ellipsoid ('positive curvature') or hyperboloid ('negative curvature').
In general relativity, the 'principle of equivalence' says that to first order we can always locally approximate spacetime by Minkowski spacetime - so very small and short-lived experiments in free fall behave almost like they do in Minkowski spacetime, which is flat. But when we go to second order, we see effects of curvature - i.e., gravity.
You could say it's all about Taylor series. Physicists like to joke that "to first order, everything is linear". You can go quite far linearizing everything. But then there should be another saying: "to second order, everything is quadratic". That's where curvature shows up.
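(One formula that makes "to second order, everything is quadratic" precise - quoting it from memory, so trust the shape more than the constant: in Riemann normal coordinates around a point,
$$g_{ij}(x) = \delta_{ij} - \tfrac{1}{3} R_{ikjl}\, x^k x^l + O(|x|^3).$$
The metric is flat to first order, and the first correction is exactly the curvature tensor.)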
Hi! My apologies for dropping off, things have been getting busier.
In preparation for better understanding Lagrangians, I've been reviewing the calculus of variations and what it "looks like" in this fiber bundle/manifold perspective. Calculus of variations centers around "functional derivatives", which are derivatives of functionals. First, I want to make sure I understand what a functional "looks like" in the category of smooth manifolds - is it sufficient to say it is simply a smooth map of the following form: $S \colon C^\infty(M) \to \mathbb{R}$? This would take in a real valued function defined on the manifold and return a real number - this is basically a functional, right? If so, then is a functional derivative simply the differential of this map: $dS$?
However, I've also seen the notion of a "functional derivative" formalized in terms of Gateaux Derivatives, which according to Wikipedia generalize directional derivatives. But I thought differentials generalized directional derivatives? Maybe I'm not understanding something correctly!
"Functional derivatives" have been formalized in many ways - Gateaux derivatives, Frechet derivatives, and others. But until you need to learn about these, it's probably easier to work in your favorite cartesian closed category of smooth spaces (e.g. diffeological spaces), and say something like what you said:
In physics a "field" is often described as a smooth section of a fiber bundle over a smooth space $M$. The space of all smooth sections is a smooth space in itself, say $\Gamma$. Given any smooth function $S \colon \Gamma \to \mathbb{R}$ we can define its functional derivative to be its differential $dS$, which in turn gives a 1-form on $\Gamma$, also often called $\delta S$.
(I said a field is often described as a smooth section of a fiber bundle, because we've already seen a case where it's not: a connection is not a smooth section of a fiber bundle until we fix a 'reference connection' and describe other connections in terms of that one.)
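(A standard example, with signs depending on one's convention for the Laplacian: take $\Gamma = C^\infty(M)$ for a closed Riemannian manifold $M$ and the Dirichlet action
$$S(\phi) = \tfrac{1}{2}\int_M |d\phi|^2\,\mathrm{vol}.$$
Then for a tangent vector $\psi$ at $\phi$, integration by parts gives
$$dS|_\phi(\psi) = \int_M \langle d\phi, d\psi\rangle\,\mathrm{vol} = \int_M (\Delta\phi)\,\psi\,\mathrm{vol}, \qquad \Delta = d^* d,$$
so the functional derivative $\delta S / \delta \phi$ is $\Delta\phi$.)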
Thanks, that makes sense! And I guess generalizing derivatives to differentials or to Frechet/Gateaux derivatives are just different directions of generalization. They both start in Euclidean space, but one generalizes to manifolds and differential geometry while the other generalizes to topological vector spaces and formal analysis.
I've been learning so much about gauge theory and fiber bundles for the past few weeks and it's been really informative! I've certainly appreciated all the help along the way. But I think most of the mathematical basis has been covered (either by this discussion or the book), so this might be a natural pausing place for this topic. That's not to say there isn't still a lot to go through- mainly specific applications to physics, but I'll come back to those at a later point in time. Let me know if there's anything you think is important to remember about the mathematics of gauge theory that we might not have gone over yet!
As for what to do next, I wanted to talk more about the motivations of sheaves, inspired by the recent discussion of sites and their motivation. I'd only passively been interested in sheaves, but my interest in learning about them, and how they relate to locality, has been excited by our previous discussion on germs and how they relate to jets. So far, I've seen sheaves be motivated as providing help in "local-global problems", but I want to make this notion more concrete by identifying the exact class of local-global problems that sheaves have use in, and clearly distinguish this class from the class of local-global problems where sheaves cannot help (I'm talking very generally here, so even applying outside the domain of topology). Hopefully I will be able to do this at some point within this next discussion! I'll probably start a new topic for that when I have gotten my questions in order.
As usual I want to keep going deeper: talking about the concepts of bundle, connection and curvature without talking about Yang-Mills theory or general relativity is like eating an appetizer and then leaving the restaurant before dinner is served! But that's okay: if you feel you've had enough that's fine.
John Baez said:
As usual I want to keep going deeper: talking about the concepts of bundle, connection and curvature without talking about Yang-Mills theory or general relativity is like eating an appetizer and then leaving the restaurant before dinner is served! But that's okay: if you feel you've had enough that's fine.
Don't worry! I'm excited to learn more about these concepts, I'm just taking a quick break for the time being to get to some other topics I was curious about. But I will certainly circle back!
Okay. I will try to hang back and let other people do most of the talking about sheaves, since a lot of people here understand sheaves better than me, particularly sheaves on general sites and their topos-theoretic aspects. I'm mainly a student of sheaves, especially their applications to complex geometry and algebraic geometry a la Griffiths and Harris' Principles of Algebraic Geometry.
Actually, before I go on to sheaves, I was reviewing my notes and discovered another question I wanted to ask. Above, we saw how composing a vector-bundle-with-connection-as-a-functor with some endofunctor of $\mathrm{core}(\mathsf{Vect})$ allows one to "transfer" connections functorially. This reminded me of a discussion we had a while ago about transferring connections - not between a bundle and its dual or endomorphism bundle, but instead between a principal bundle and an associated bundle. This made me wonder if there's an analogous way to view that transfer as a composition of functors, exactly like what we did above.
Take a principal $G$-bundle with connection given by a functor $T \colon \mathcal{P}(M) \to G\text{-}\mathsf{Tor}$. The above would require there to be, for every representation $\rho$ of $G$, a functor $\tilde{\rho} \colon G\text{-}\mathsf{Tor} \to \mathsf{Vect}$ such that the composition of $\tilde{\rho}$ and $T$ would precisely yield the associated vector bundle already equipped with the appropriate connection. However, I can't seem to define this functor $\tilde{\rho}$. As mentioned in Urs' article, there's certainly a functor $\mathrm{B}G \to \mathsf{Vect}$, since a representation of a group is just a functor from the one-object category $\mathrm{B}G$ into vector spaces. So while we can compose such a functor with a functor $\mathcal{P}(M) \to \mathrm{B}G$, the latter is the definition of a principal bundle with connection assuming a trivialization, as we discussed above. My question is then: is there any way to define the transfer of a connection in a functorial way, as I am trying to do above, without reference to a trivialization, or is that just completely impossible to do? That is, does the functor $\tilde{\rho}$ as I defined it above exist, is it well-defined, and is it easily constructible?
(Also, Happy Halloween!)
:pumpkin: Happy Halloween, @John Onstead! :pumpkin:
And thanks for a very engaging conversation.
This is a great question. Let's see if I can figure out an answer. I'll just concentrate on how we associate a vector bundle to a principal bundle, somehow turning each fiber of the principal $G$-bundle into a vector space using a representation of $G$.
Say I have a representation of the Lie group $G$, i.e. a Lie group homomorphism
$$\rho \colon G \to \mathrm{GL}(V)$$
where $V$ is a finite-dimensional vector space and $\mathrm{GL}(V)$ is its Lie group of automorphisms, usually called the [[general linear group]].
Let $X$ be a $G$-torsor. Then we can "associate" to it a vector space that people often call $X \times_G V$. I'll describe its underlying set: it's the quotient of $X \times V$ by the equivalence relation
$$(xg, v) \sim (x, \rho(g)v)$$
In other words, we have a coequalizer diagram in $\mathsf{Set}$:
$$X \times G \times V \rightrightarrows X \times V \to X \times_G V$$
But in fact $X \times_G V$ has the structure of a vector space! That's because elements of $X \times_G V$ are equivalence classes $[(x, v)]$, but because $X$ is a torsor, once we fix $x \in X$ every such equivalence class is equal to a unique one of the form $[(x, v)]$, and we can define vector space operations by
$$[(x, v)] + [(x, w)] = [(x, v + w)], \qquad c\,[(x, v)] = [(x, cv)]$$
where $v, w \in V$ and $c \in \mathbb{R}$ or $\mathbb{C}$.
So, your desired functor
$$\tilde{\rho} \colon G\text{-}\mathsf{Tor} \to \mathsf{Vect}$$
sends any $G$-torsor $X$ to the vector space $X \times_G V$.
Of course I should really check that the vector space structure on is well-defined, and define your desired functor on morphisms, and check that it's really a functor! But I'm a lazy guy so won't.
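(Well, the key point of the check is short enough to sketch anyway: the operations don't depend on the chosen basepoint. If we'd used $x' = xg$ instead, then $[(x', v)] = [(x, \rho(g)v)]$, and
$$[(x', v)] + [(x', w)] = [(x', v + w)] = [(x, \rho(g)(v + w))] = [(x, \rho(g)v)] + [(x, \rho(g)w)]$$
precisely because each $\rho(g)$ is linear. So the vector space structure is well-defined.)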
By the way, I didn't solve this problem in the most elegant and general way. It actually has very little to do with vector spaces!
I think the real key is that
$$G\text{-}\mathsf{Tor} \simeq \mathrm{B}G$$
In other words, the category of $G$-torsors is equivalent to the one-object category $\mathrm{B}G$ with elements of $G$ as morphisms and multiplication as composition!
This makes it easy to turn a representation of $G$ on a vector space $V$ into a functor from $G$-torsors into vector spaces!
But this slick approach matches the clunkier 'standard' approach I described before.
John Baez said:
So, your desired functor
$$\tilde{\rho} \colon G\text{-}\mathsf{Tor} \to \mathsf{Vect}$$
sends any $G$-torsor $X$ to the vector space $X \times_G V$.
Of course I should really check that the vector space structure on is well-defined, and define your desired functor on morphisms, and check that it's really a functor! But I'm a lazy guy so won't.
Oh thanks, I think that solves my problem! I'm pretty sure this functor is well defined, since I think it's actually doing the opposite of what we noticed above. There, we realized an endofunctor on $\mathrm{core}(\mathsf{Vect})$ could "lift" to an endofunctor on the core of the category of vector bundles (with the same being true for endofunctors on the core of G-Tor "lifting" to those on the core of the category of principal $G$-bundles). In this case, we can go the opposite direction: we can start with the usual associated bundle functor between the category of principal $G$-bundles and vector bundles given some representation, restrict this to a functor between the cores of both respective categories, and then show that this functor "lowers" to one between the core of G-Tor and $\mathsf{Vect}$. It would suffice to show that the functor on the bundles acts fiberwise, which I'm confident it does (maybe it can be seen as "swapping out" fibers in some way?)
Yes, the point of the associated bundle construction is that it swaps out each fiber of our principal bundle, which is a $G$-torsor $P_x$, with some other fiber $P_x \times_G F$, where $F$, called the standard fiber, is some chosen thing on which $G$ acts.
We always have $P_x \times_G F \cong F$, but this isomorphism is not canonical. If it were, the associated bundle would always be trivial!
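(To spell that out: each point $p$ of the torsor $P_x$ gives an isomorphism $P_x \times_G F \to F$ sending $[(p, f)]$ to $f$, but a different point $p' = pg$ gives a different one, differing from the first by the action of $g^{-1}$ on $F$. With no preferred point of $P_x$, there's no preferred isomorphism.)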
John Baez said:
I think the real key is that
$$G\text{-}\mathsf{Tor} \simeq \mathrm{B}G$$
In other words, the category of $G$-torsors is equivalent to the one-object category $\mathrm{B}G$ with elements of $G$ as morphisms and multiplication as composition!
I noticed this too - since the core of G-Tor is a connected groupoid, it would "collapse" to a single object under the skeleton construction. But the problem here is that while the categories are equivalent, I don't think there is a canonical equivalence - it's basically the one-level-up version of how there's no canonical isomorphism $X \cong G$ for torsors! It seems that each choice of equivalence thus corresponds to a choice of trivialization. But I could be completely wrong about all this!
Actually there's a god-given functor $\mathrm{B}G \to G\text{-}\mathsf{Tor}$ which does this:
- it sends the one object of $\mathrm{B}G$ to the $G$-torsor $G$ (with its usual right action on itself)
- it sends any morphism $g$ (which is just an element of $G$) to the $G$-torsor morphism $L_g \colon G \to G$ (that is, left multiplication by $g$).
And this is an equivalence of categories!
So we should really think of $G\text{-}\mathsf{Tor}$ as just another way of thinking about $\mathrm{B}G$. I should probably have said this about two months ago. It's a way of thinking about $\mathrm{B}G$ where the faceless single object is replaced by a bunch of objects that have a bit more personality. Torsors feel more like "interesting mathematical structures" than a boring old $\bullet$, a random symbol I've drawn on the page. But they're all isomorphic so in some sense this is a sham. What's really interesting is the morphisms between them!
This shift of viewpoint, while trivial in a way, is actually quite profound in how it makes us think about things differently. I don't think geometers could get interested in principal bundles if they didn't think of torsors as "interesting mathematical structures" - namely, nonempty spaces with a free and transitive action of $G$.
John Baez said:
Actually there's a god-given functor $\mathrm{B}G \to G\text{-}\mathsf{Tor}$ which does this:
- it sends the one object of $\mathrm{B}G$ to the $G$-torsor $G$ (with its usual right action on itself)
- it sends any morphism $g$ (which is just an element of $G$) to the $G$-torsor morphism $L_g \colon G \to G$ (that is, left multiplication by $g$).
I'm still slightly confused. A functor involves a mapping of homs $\mathrm{Hom}(x, y) \to \mathrm{Hom}(F(x), F(y))$. Here that is just a map $G \to \mathrm{Aut}(X)$ for the torsor $X$ the one object is sent to, but earlier we stated there isn't a canonical map from a group into the automorphism group of a torsor! Maybe I'm misremembering something, since that discussion was a while ago?
John Baez said:
It turns out that if you have a principal bundle $P \to M$, each fiber $P_x$ for $x \in M$ is a torsor of $G$. The automorphism group of this torsor is isomorphic to $G$, but not canonically, and in many cases this prevents us from identifying gauge transformations with smooth functions $M \to G$. (Not only is there not a canonical identification, there's none at all.)
This is what I mean!
John Onstead said:
I'm still slightly confused. A functor involves a mapping of homs $\mathrm{Hom}(x, y) \to \mathrm{Hom}(F(x), F(y))$. Here that is just a map $G \to \mathrm{Aut}(X)$ for the torsor $X$ the one object is sent to, but earlier we stated there isn't a canonical map from a group into the automorphism group of a torsor!
That's true - but note in my functor I am sending the one object not to just any old random torsor, but to $G$ itself, viewed as a right $G$-torsor! The automorphism group of this $G$-torsor is isomorphic to $G$ in a standard way, since its automorphisms are just left translations by elements of $G$.
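(The check is one line: a $G$-torsor automorphism $\phi \colon G \to G$ must commute with right translations, so $\phi(h) = \phi(1 \cdot h) = \phi(1)\,h$, i.e. $\phi$ is left translation by $\phi(1)$, and $\phi \mapsto \phi(1)$ identifies the automorphism group with $G$.)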
So this functor is as god-given as can be. Maybe your point amounts to this: there's not a god-given weak inverse to this functor! A weak inverse would need to send an automorphism of an arbitrary $G$-torsor to an element of $G$.
(By weak inverse I mean 'inverse up to natural isomorphism', so an equivalence is a functor $F$ with a weak inverse $\bar{F}$, but there are several ways to think about this: we can demand the mere existence of the weak inverse, or a specified weak inverse $\bar{F}$, or a specified weak inverse $\bar{F}$ together with a choice of natural isomorphisms $F\bar{F} \cong 1$ and $\bar{F}F \cong 1$.)
So, this is a subtle issue... and we should expect all these irritatingly subtle issues to bite us in the ass when we are trying to delve into the very essence of symmetry, which is all about how things can be 'the same'.
However, while there are various ways to choose all the data of an equivalence
$$\mathrm{B}G \simeq G\text{-}\mathsf{Tor}$$
I am merely claiming that such an equivalence exists. If we pick one, we get a way to turn any action of $G$ on any object in any category $C$, meaning a functor
$$\mathrm{B}G \to C$$
into a functor
$$G\text{-}\mathsf{Tor} \to C$$
This lets us do the "associated bundle" trick starting from a principal $G$-bundle and an action of $G$ on anything in any category.
We can now have the fun (or pain) of studying how this construction depends on our choice of equivalence $\mathrm{B}G \simeq G\text{-}\mathsf{Tor}$.
Ah I see, thanks for the explanation!
I know I said I'd take a break from bundles but then another question popped into my head, and I think this one might be a good example for contextualizing all of what we've covered so far. Given a symmetry of nature, Noether's theorem states there exists a corresponding conserved quantity. Secondly, the conservation law can be given as a "continuity equation", either in integral or differential form (via the divergence theorem, these two are equivalent). In differential form, it's written as $d \star J = 0$, where $J$ is the conserved current.
So here's my question. Let's say you are handed a random group $G$, corresponding to some "symmetry of nature". You can of course construct the principal $G$-bundle for this group on something like Minkowski space. But Noether's theorem seems to guarantee that, somehow, just by being given this group and bundle, at some point down the road you will arrive at some equation of the form $d \star J = 0$ - not only that, but you will arrive here in a non-arbitrary way. My question is: how? What is the step by step procedure of realizing Noether's theorem/getting from $G$ to $d \star J = 0$ from the fiber bundle perspective, making use of things we've covered like associated bundles, connections, etc. as needed? (though of course the specific connection and representation shouldn't matter since Noether's theorem needs to apply in all cases) I know Noether's theorem makes the most sense in the context of Lagrangian mechanics, but my goal here is to get a general sense of how it works in the context of fiber bundles alone without reference - or at least as little reference as possible - to Lagrangians or Lagrangian mechanics.
You're probably not totally going to get that since Noether's theorem is something that's true at the level of mechanics, not dynamical systems in general.
John Onstead said:
So here's my question. Let's say you are handed a random group $G$, corresponding to some "symmetry of nature". You can of course construct the principal $G$-bundle for this group on something like Minkowski space. But Noether's theorem seems to guarantee that, somehow, just by being given this group and bundle, at some point down the road you will arrive at some equation of the form $d \star J = 0$ - not only that, but you will arrive here in a non-arbitrary way.
No, you won't get a 1-form $J$ with $d \star J = 0$ starting from so little information. Noether's theorem says that if you have some fields on spacetime, and a Lagrangian that depends on these fields, and $G$ acts on these fields in such a way that the Lagrangian is invariant up to a total divergence, then you get a 1-form $J$, and this obeys $d \star J = 0$ when the field equations coming from the Lagrangian hold. But $J$ depends on the Lagrangian, and you need to use the field equations to show $d \star J = 0$.
It's probably good to look at the statement and easy proof of Noether's theorem. Here is a version for fields of the form $\phi \colon M \to N$ where $M$ and $N$ are any manifolds. There are more general versions but the argument is always similar.
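(Here's the usual coordinate sketch for a first-order Lagrangian $\mathcal{L}(\phi, \partial_\mu \phi)$, assuming for simplicity that the symmetry $\delta\phi = X(\phi)$ leaves $\mathcal{L}$ exactly invariant:
$$0 = \delta\mathcal{L} = \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,\partial_\mu \delta\phi.$$
When the Euler-Lagrange equations $\frac{\partial \mathcal{L}}{\partial \phi} = \partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}$ hold, this becomes
$$0 = \partial_\mu\!\left(\frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,X(\phi)\right),$$
so $J^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,X(\phi)$ is the conserved current. Note how both the Lagrangian and the field equations enter.)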
Ok, so Noether's theorem does require the Lagrangian after all. I think I was confusing a few things. First, I mistakenly thought you could derive a law $d \star J = 0$ for any representation and connection, since you can derive a curvature and Bianchi identity for any representation and connection, and I know that Maxwell's equations (which include a continuity equation for charge) can be derived from the Bianchi identity. But I'm guessing I got this wrong, probably because the Bianchi identity isn't the only component of Maxwell's equations; you also need the Yang-Mills term. I'll certainly go back and review that again!
I also think I confused two notions of "charge". There's "charge as a representation" where you can index certain representations of a bundle by a number known as the "charge" which you've mentioned above. But then there's "charge as a physical property" which can have a density across space and be a field in itself. I conflated these two when in reality they are two separate things! It seems that in the case of the former, in QFT, the choice of "charge" would dictate how much electric charge the particles of the QFT would possess (so, for the electron field, the charge representation would be 1, while for a quark it might be 2/3 or 1/3). Meanwhile, the latter is used for when you have a charge density across space, such as when you have a lot of electrons together moving around in space.
John Onstead said:
Ok, so Noether's theorem does require the Lagrangian after all. I think I was confusing a few things. First, I mistakenly thought you could derive a law $d \star J = 0$ for any representation and connection since you can derive a curvature and Bianchi identity for any representation and connection...
The Bianchi identity says that the exterior covariant derivative of some $\mathrm{End}(E)$-valued 2-form, the curvature $F$, must vanish. This is just a fact about connections and their curvature. But to state conservation of charge in the form $d \star J = 0$ requires that you cook up a 1-form $J$ on $n$-dimensional spacetime. And this quantity, called the current, depends on fields other than the connection, in a manner that depends on the Lagrangian.
I know that Maxwell's equations (which include a continuity equation for charge) can be derived from the Bianchi identity.
No, they can't. If they could, Maxwell's equations would be a mathematical tautology rather than a law of physics that needs to be checked experimentally!
It's true that half of Maxwell's equations are a mathematical tautology when formulated in terms of $F$, and this half is the Bianchi identity:
$$d_\nabla F = 0$$
But it's the other half, the non-tautological Maxwell equations, that give the actual physics:
$$\star\, d_\nabla \star F = J$$
including the continuity equation
$$d_\nabla \star J = 0$$
Here I'm writing everything down in sufficient generality that it applies to Yang-Mills theory - see page 261 of my book. If you only want the Maxwell equations, take $E$ to be a trivial line bundle and $d_\nabla = d$.
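(In the Maxwell case, conservation of charge is then a one-liner: from $\star\, d \star F = J$ we get $\star J = \pm\, d \star F$, so
$$d \star J = \pm\, d\, d \star F = 0$$
using nothing but $d^2 = 0$. The Yang-Mills case works the same way with $d_\nabla$ in place of $d$, though there $d_\nabla^2$ is no longer zero and one has to use properties of the curvature.)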