You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
I've been working on learning some of the introductory ideas of algebraic geometry. To my understanding, in this context there is a duality between algebra and geometry:
In the extreme, there is a correspondence between maximal ideals (large and algebraic) and "smallest points" (small and geometric).
Intuitively, this seems like a manifestation of the duality between information and ambiguity. The more one knows about something, or the more that one observes something, the less ambiguity remains regarding what that something can possibly be. So if I pick some mystery geometric region and gradually tell you more and more polynomials that vanish on that region, then you will be able to gradually sharpen your estimate of my chosen mystery region.
Because I care about medical imaging, I care about this duality between information and ambiguity in the case where the information provided is imperfect. In this case, one's estimates of what is present are also imperfect. Motivated by this, I'm curious if algebraic geometry has been developed in the direction of exploring the duality between algebraic and geometric structures in the case where the algebra is "probabilistic" in some sense.
As an initial guess, perhaps one could try to define a probabilistic notion of a polynomial. Instead of being able to assess if a polynomial is zero or not at a point in question, we would instead be given the ability to assess the probability that a "probabilistic polynomial" is zero at some point in question.
To summarize, I have two specific questions:
I'd be really interested if anyone has done something like this, but I feel you're stretching algebraic geometry in ways it doesn't want to go. Since you mention "information" and "probability", I'd be tempted instead to advertise information theory as the power tool for questions like yours.
Given two random variables X and Y, we can mathematically define three quantities: the entropy H(X), the conditional entropy H(X|Y), and the mutual information I(X;Y).
Then there's an equation: I(X;Y) = H(X) - H(X|Y).
This says how much learning the value of Y reduces how much you don't know concerning the value of X, on average. That's how I'd be tempted to formalize the "duality" you are sensing.
It takes time to understand these concepts, but anyone who wants to understand the universe needs to learn about them - and they are very general and beautiful. Luckily they are explained on Wikipedia (at the links), and even better here:
The equation I wrote down is equation (2.39) in this book.
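To make this concrete, here's a minimal numerical sketch in Python; the joint distribution below is made up purely for illustration.

```python
import math

# Toy joint distribution p(x, y) of two binary random variables X and Y.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Marginal distribution of one coordinate of a joint distribution."""
    m = {}
    for xy, p in joint.items():
        m[xy[axis]] = m.get(xy[axis], 0.0) + p
    return m

H_X = entropy(marginal(joint, 0))
H_Y = entropy(marginal(joint, 1))
H_XY = entropy(joint)                 # joint entropy H(X, Y)
H_X_given_Y = H_XY - H_Y              # chain rule: H(X|Y) = H(X, Y) - H(Y)
I_XY = H_X - H_X_given_Y              # mutual information I(X;Y) = H(X) - H(X|Y)

print(H_X, H_X_given_Y, I_XY)
```

With this particular joint distribution, learning Y removes about 0.28 of our 1 bit of uncertainty about X.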
Thanks for reminding me of those concepts! Back in engineering grad school, I spent a little bit of time using these concepts to try and think about ultrasound imaging. I even spent some time with that book by Cover and Thomas. Ultimately other work was progressing towards publication more quickly and so we abandoned the idea. But it was interesting stuff! It might be fun to go back to it, now that I know some more math...
(We were specifically playing around with these concepts in simulation, trying to see if it was possible to optimize the length of an ultrasound pulse in 1D, given some statistical assumptions about the medium we're transmitting it through. (The longer the transmitted pulse the stronger your resulting observed signal and so the more robust you are to noise, but also the more autocorrelation crosstalk you get). I forget exactly what we were optimizing though... presumably we were trying to minimize the conditional entropy regarding the target given our resulting observations!?)
Btw, if you ever want to see how entropy emerges naturally from category theory, there's an explanation in Tom's free book:
And I have my own free book on entropy in physics. There's remarkably little overlap in these three books, and a lot of basic stuff not covered by any of them, but Cover and Thomas is the one you want for the issues I was just mentioning.
I'll mention algebraic statistics without thinking too hard about whether it's truly an answer to the question.
The section "Recent work using commutative algebra and algebraic geometry" might help @David Egolf come up with a good answer to his question, but I'm afraid it will take work: it's certainly not handing out anything on a silver platter.
From the nlab: "Tropical geometry is often thought of as algebraic geometry over the tropical semiring." This tropical semiring is the min-plus rig and is used in statistical inference, for example here: The Generalized Distributive Law...
Joe Moeller said:
I'll mention algebraic statistics without thinking too hard about whether it's truly an answer to the question.
Thanks for linking that page! On that page, there are several things that look interesting to me. For example, from that article I found a paper by David Mumford called "Pattern Theory: The Mathematics of Perception".
The section "Recent work using commutative algebra and algebraic geometry" is also interesting, and probably relevant to what I asked about in this thread.
Roughly, the idea sketched in that section appears to me to be this: If we have a discrete probability distribution determined by some unknown parameter, then at least in some cases the individual probability values in each possible distribution are those that satisfy some polynomial equation. So we can then study our collection of possible probability distributions by studying where a certain polynomial goes to zero - which can presumably be done using algebraic geometry.
In this approach, it appears that we construct (standard) polynomials of interest from some probabilistic thing of interest. That's not a perfect match with what I was wondering about in this thread, but it's still interesting.
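A standard toy example of that idea (my own illustration, not taken from the linked section): for a binomial distribution with two trials and unknown success probability t, the three outcome probabilities always satisfy a fixed polynomial equation, so the whole statistical model sits inside the zero set of a polynomial.

```python
# For the binomial(2, t) model, the outcome probabilities are
#   p0 = (1 - t)^2,  p1 = 2 t (1 - t),  p2 = t^2,
# and they satisfy the polynomial equation p1^2 - 4 p0 p2 = 0 for every t.
for t in [0.0, 0.1, 0.37, 0.5, 0.9, 1.0]:
    p0, p1, p2 = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
    assert abs(p1 ** 2 - 4 * p0 * p2) < 1e-12   # the defining polynomial vanishes
    assert abs(p0 + p1 + p2 - 1) < 1e-12        # probabilities sum to 1
print("the binomial(2, t) model lies in the zero set of p1^2 - 4 p0 p2")
```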
Simon Burton said:
From the nlab: "Tropical geometry is often thought of as algebraic geometry over the tropical semiring." This tropical semiring is the min-plus rig and is used in statistical inference, for example here: The Generalized Distributive Law...
This also looks quite interesting! I'll plan to leave a lengthier comment here once I've spent some time looking at this paper.
A bunch of people have written about how the tropical rig is the limit, as the temperature T approaches 0, of a rig that's important in statistical mechanics at temperature T. I first learned about this from James Dolan and I could easily bloviate about it at length. I wrote a short version here:
But one really broad lesson is that a lot of algebraic geometry can be generalized from commutative rings to commutative rigs and even more general commutative Lawvere theories - Lawvere theories where all the operations "commute" in a suitable sense. (It takes a while to figure out what it means for a 3-ary operation and a 5-ary operation to "commute", but the answer is visually satisfying.) Nikolai Durov has a mammoth book generalizing lots of algebraic geometry (e.g. schemes) from commutative rings to commutative Lawvere theories, but part of why this book is so huge is that he assumes the reader knows little of category theory, and proceeds at a slow pace.
In cryptography, polynomials are used to encrypt data. More specifically, cryptographers use ideals to introduce noise (randomness) into the encrypted data. To draw an analogy with your problem, @David Egolf, encrypted data can be compared to raw data awaiting analysis.
Below, I will denote by m some valuable information that one wants to discover. In the context of encryption, m is a message that an attacker wants to discover, but maybe in your case @David Egolf, m is some information that you want to discover through some analysis (just to give some ideas here).
Now, the idea is to have a polynomial c that encrypts/hides the information m. In general, this polynomial will be considered modulo a certain polynomial q, such that it makes sense to write c mod q. An important detail to keep in mind here is that the reduction mod q will likely change a large polynomial. However, we assume that m is small enough to have m mod q = m.
The polynomial c is usually encoded as a sum c = p + m. This means that if you know p, then you can discover m using the subtraction m = c - p. In other words, the quantity p will likely be the key focus of your research analysis in order to learn m.
Still, encryption doesn't stop there. The encryption is usually hidden by an extra polynomial e (this is the noise), which would belong to an ideal of the form (q') where q' is some other polynomial. In other words, we have c = p + m + e where e is in (q').
In practice, we also assume that the noise e is small enough that m can still be recovered exactly.
Depending on the size of the noise e, two scenarios can occur:
A few remarks:
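To see the noise-in-an-ideal idea in the simplest possible setting, here is a toy integer analogue in Python. This is only a sketch of the mechanism described above, not a real cryptosystem; all the names (m, p, e, q) are illustrative.

```python
import random

q = 97                             # modulus: the noise lives in the ideal (q)
m = 42                             # the message; we assume 0 <= m < q, so m mod q = m
p = random.randrange(10 ** 6)      # the mask known to the intended receiver
e = q * random.randrange(10 ** 3)  # noise: a random element of the ideal (q)

c = p + m + e                      # the "ciphertext": the message is hidden twice

# The receiver knows p and q: subtracting p and reducing mod q kills the noise,
# because e is a multiple of q and m is small enough to survive the reduction.
recovered = (c - p) % q
print(recovered)  # 42
```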
Not exactly on topic but perhaps worth noting that multilayer perceptrons with ReLU activations instantiate tropical polynomial maps. Meanwhile, transformer blocks at zero temperature are also (essentially?) algebraic maps. So large language models and their ilk at zero temperature are distributions formed from tropical polynomials.
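To make the tropical claim concrete, here is a small sketch (the weights are made up): in the max-plus ("tropical") semiring, addition is max and multiplication is +, so a tropical polynomial is a maximum of affine functions, and a ReLU network computes a difference of such maxima.

```python
def relu(z):
    return max(0.0, z)

def network(x):
    # a one-hidden-layer ReLU network with arbitrarily chosen weights
    return 1.0 * relu(x - 1.0) - 2.0 * relu(-x + 0.5)

def tropical_form(x):
    # the same function as a difference of two tropical (max-plus) polynomials
    f_plus = max(x - 1.0, 0.0)           # tropical polynomial in x
    f_minus = max(-2.0 * x + 1.0, 0.0)   # tropical polynomial in x
    return f_plus - f_minus

# the two expressions agree everywhere (both are the same piecewise-linear map)
for x in [-3.0, 0.0, 0.25, 0.5, 1.0, 4.0]:
    assert abs(network(x) - tropical_form(x)) < 1e-12
```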
Thanks everyone! All these comments look interesting! It will, however, take me some time to get to them all.
Simon Burton said:
From the nlab: "Tropical geometry is often thought of as algebraic geometry over the tropical semiring." This tropical semiring is the min-plus rig and is used in statistical inference, for example here: The Generalized Distributive Law...
I'm still working on reading this paper, but I already learned something pretty interesting from it! This paper tells me that one can form a "commutative semiring" (which we might also call a "commutative rig") from any distributive lattice. The nLab tells me that any Heyting algebra forms a distributive lattice. And "Sheaves in Geometry and Logic" tells me that if X is any object in an (elementary?) topos then the subobjects of X form a Heyting algebra.
So, any object X in any (elementary?) topos comes with an associated commutative rig, where the elements in the rig are the subobjects of X!
Yes, that's fun. I believe a distributive lattice is the same as a commutative rig where x + x = x and xx = x for all x.
The idea is that "or" is like +, but it obeys "x or x = x", and similarly "and" is like ×, but it obeys "x and x = x".
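As a quick sanity check (a sketch, using the smallest interesting example): the subsets of a two-element set under union and intersection satisfy the commutative-rig axioms, with both operations idempotent.

```python
from itertools import product

# Subsets of {0, 1}; "addition" is union, "multiplication" is intersection.
elements = [frozenset(s) for s in [(), (0,), (1,), (0, 1)]]
add = lambda a, b: a | b
mul = lambda a, b: a & b

for a, b, c in product(elements, repeat=3):
    assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)   # commutativity
    assert add(add(a, b), c) == add(a, add(b, c))              # associativity of +
    assert mul(mul(a, b), c) == mul(a, mul(b, c))              # associativity of ×
    assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))      # distributivity
for a in elements:
    assert add(a, a) == a and mul(a, a) == a                   # both idempotent
print("all axioms hold")
```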
That is very cool! So presumably we can do some amount of logic with the subobjects of an object of an elementary topos.
Wrapping back around to algebraic geometry, I guess we use this "logic" when talking about the subsets of some set on which different polynomial ideals vanish.
In that setting, we have some nice relationships. For example, V(I) ∩ V(J) = V(I + J), where V(I) is the subset of some set X on which an ideal I of polynomials on X vanishes, and I + J is the ideal generated by I and J.
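That relationship can be spot-checked numerically on a small grid of points (a sketch; the two polynomials are made up): the common zeros of the generators of both ideals are exactly the intersection of the two zero sets.

```python
import itertools

f = lambda x, y: x * y      # vanishes on the two coordinate axes
g = lambda x, y: x - y      # vanishes on the diagonal x = y

grid = list(itertools.product(range(-3, 4), repeat=2))

def V(polys):
    """Grid points where every polynomial in the list vanishes."""
    return {pt for pt in grid if all(h(*pt) == 0 for h in polys)}

# zeros of the ideal generated by f and g = intersection of the two zero sets
assert V([f, g]) == V([f]) & V([g])
print(sorted(V([f, g])))  # [(0, 0)]: the only point on both the axes and the diagonal
```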
I wonder if one could try and develop this setup using subobjects of some object in a topos different from Set. So then we would presumably want this: V(I) ∧ V(J) = V(I + J), where ∧ is the product in the lattice of subobjects of some object in our chosen topos.
For example, one might do this if one wished to study "curves" where each "curve" is a subgraph or a subpresheaf instead of a subset.
(This idea catches my attention in part because I have the idea that it may be interesting to represent "knowledge states" in the context of imaging as presheaves.)
Anyways, I should probably spend a lot more time on standard algebraic geometry before trying out something like this!
John Baez said:
Yes, that's fun. I believe a distributive lattice is the same as a commutative rig where x + x = x and xx = x for all x.
John Baez said:
The idea is that "or" is like +, but it obeys "x or x = x", and similarly "and" is like ×, but it obeys "x and x = x".
Discussion on whether this is true or not has been moved to #learning: questions > commutative rigs, distributive lattices, etc.
However I think people in that thread have come up with a commutative rig where x + x = x and xx = x for all x but which is not a distributive lattice. Apparently one needs to also have the condition that 1 + x = 1 for all x, which then automatically implies that x + x = x.
I'm still working on reading the above paper "The Generalized Distributive Law". It's slow going because I keep getting interested in spin-off thoughts :laughing:.
So far I don't yet see how the tropical rig is important in this paper - I'm still looking forward to getting to that.
A couple interesting tidbits from the paper so far, though:
(I suppose a similar thing potentially holds true for any equation that holds in an algebraic structure).
Re distributive laws between monads: they are directed, so for example if you have a rig freely generated by some elements x_i you can write any element of the rig as a sum of products of the x_i, but not in general as a product of sums of the x_i.
In other words, "expanding" a product of sums is a systematic process, but "simplifying" a sum of products is not so straightforward.
John Baez said:
In other words, "expanding" a product of sums is a systematic process, but "simplifying" a sum of products is not so straightforward.
That makes sense! With regards to saving on computation, we can imagine that a(b + c) is faster to compute (due to having fewer multiplications) compared to ab + ac. And it is very easy to rewrite a(b + c) as ab + ac, but potentially much trickier to rewrite ab + ac as a(b + c). So if we want to save on computation, we need to work for it!
I want to work a bit on this thread again. I'm currently working towards learning how the tropical rig can be used in statistical inference, by reading the paper "The Generalized Distributive Law" linked above. (And then I want to take a look at some of the other interesting comments made above! There's so much to learn!)
I made a little more progress in reading that paper, and thought this was pretty interesting. The authors introduce the "MPF problem", which refers to a situation where one has a product of functions and then wishes to compute a "marginalized" version of it, by adding up all the values that the product attains as some of its inputs range over all their possible values.
This is interesting because various famous and useful things have a related MPF problem. As examples, the paper gives the Hadamard transform and the Fourier transform (over any finite abelian group). But I think that any (finite) convolution or cross-correlation will also give an example!
(Cross-correlation in particular is of interest to me, because it can play an important role in forming ultrasound images from "coded" transmissions. Basically, we send out a particular pattern of ultrasound and then look for that pattern in the echoes we get back using a cross-correlation.)
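Here's a minimal MPF-style sketch in Python (the local functions are made up): to marginalize a product of two functions that share only one variable, the generalized distributive law says we can push the sum over x inside, marginalizing f first instead of looping over all (x, y) pairs.

```python
import itertools

vals = range(4)
f = lambda x, y: (x + 2 * y) % 5 + 1   # arbitrary local function of (x, y)
g = lambda y, z: (y * z) % 3 + 1       # arbitrary local function of (y, z)

# naive marginalization: for each z, sum the product over all (x, y) pairs
naive = {z: sum(f(x, y) * g(y, z) for x in vals for y in vals) for z in vals}

# distributive-law order: marginalize x out of f once, then sum over y only
f_marg = {y: sum(f(x, y) for x in vals) for y in vals}
fast = {z: sum(f_marg[y] * g(y, z) for y in vals) for z in vals}

assert naive == fast   # same answer, with far fewer multiplications
```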
I continue to slowly work through this thread! I'm continuing to read in "The Generalized Distributive Law" because I want to see how the tropical rig can show up in the context of probabilistic reasoning.
The paper gives an interesting example relating to this, which I'll share here.
Let us imagine that we receive a seven-digit binary sequence over some channel. One of several predefined "codes" was transmitted, but there is some probability that each transmitted digit got swapped to the opposite digit (i.e. 0 -> 1 or 1 -> 0) during transmission. So our goal is to figure out which code was transmitted, given the (possibly modified) sequence that we receive.
Given a transmitted code x, we model the likelihood of having received y as the product of the likelihoods of receiving each digit individually: p(y|x) = p(y_1|x_1) ··· p(y_7|x_7). Given the observed y, our goal is to find the x that maximizes this probability.
To think about how to maximize the above, we can equivalently think about how to minimize -log p(y|x) = -log p(y_1|x_1) - ··· - log p(y_7|x_7). We can then consider attempting to minimize this quantity with respect to only some of the x_i, so that we get a function in the remaining variables.
For example, we might end up with this function:
We notice that this function involves the "addition" (minimum) and "multiplication" (regular addition) of the tropical rig.
(Note that for this optimization problem to be interesting, we have to restrict the binary sequences we are searching over to the valid codes. The paper achieves this by using parity check functions.
So, we probably instead want to minimize the following with respect to all the x_i. Namely, we want to minimize -log p(y_1|x_1) - ··· - log p(y_7|x_7) + χ_1(x) + ··· + χ_m(x). Here each χ_j checks if a given sequence passes a particular parity check, outputting infinity if it does not and 0 if it is valid. So the χ_j together check if a code is valid. But at any rate, we are still seeing the tropical rig operations show up here.)
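Here is a tiny decoding sketch along those lines in Python, using a 3-bit repetition code instead of the paper's seven-digit code, and a made-up flip probability: tropical "multiplication" (ordinary +) accumulates the per-digit costs, and tropical "addition" (min) picks the best codeword.

```python
import math

t_add = min                        # tropical addition
t_mul = lambda a, b: a + b         # tropical multiplication

p_flip = 0.1                       # assumed probability of a digit flipping
received = (1, 0, 1)               # the (possibly corrupted) sequence we observed
codewords = [(0, 0, 0), (1, 1, 1)] # valid codes: the 3-bit repetition code

def cost(bit, obs):
    """-log likelihood of observing obs when bit was transmitted."""
    return -math.log(1 - p_flip if bit == obs else p_flip)

best_cost, best_codeword = None, None
for cw in codewords:
    total = 0.0
    for b, r in zip(cw, received):
        total = t_mul(total, cost(b, r))        # tropical product of digit costs
    if best_cost is None or t_add(best_cost, total) == total:
        best_cost, best_codeword = total, cw    # tropical sum keeps the minimum

print(best_codeword)  # (1, 1, 1): two of the three received digits are 1
```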
Jumping in without checking history: have you seen https://www.pnas.org/doi/10.1073/pnas.0406010101
JR said:
Jumping in without checking history: have you seen https://www.pnas.org/doi/10.1073/pnas.0406010101
Thanks for the link! I have not read that article yet, but I'll plan to take a look at some point! On a related note, I recently found the article "Algebraic, tropical, and fuzzy geometry", which might be relevant too.
One last thought from "The Generalized Distributive Law": if we have multiple random variables and we only observe a value for some of them, then we are left with a "region" that holds all the still-possible values for the random variables.
Moving on from the (enjoyable and interesting) "The Generalized Distributive Law", I see the next comment above talks about the tropical rig in the context of statistical mechanics.
So I will take a look at this post next: The space of physical frameworks (part 1) !
That post describes a way to put a rig structure on a set, given a bijection from that set to a rig. I'm now curious if this works in general!
Let f: X → R be a bijection and let R be a rig. Can we make X a rig as follows? We define x + y := f⁻¹(f(x) + f(y)) and x · y := f⁻¹(f(x)f(y)).
Since f is a bijection, the proposed addition and multiplication are well-defined.
What should the zero element of X be? We can try 0_X := f⁻¹(0). Then x + 0_X = f⁻¹(f(x) + f(f⁻¹(0))) = f⁻¹(f(x) + 0) = x, as desired.
We can also try 1_X := f⁻¹(1). Then x · 1_X = f⁻¹(f(x)f(f⁻¹(1))) = f⁻¹(f(x) · 1) = x.
Next, we check associativity. (x + y) + z = f⁻¹(f(f⁻¹(f(x) + f(y))) + f(z)) = f⁻¹((f(x) + f(y)) + f(z)). Since we have associativity of addition in R, this becomes f⁻¹(f(x) + (f(y) + f(z))) = x + (y + z). I think a similar argument should work for multiplication.
It remains to check distributivity. We have x · (y + z) = f⁻¹(f(x)(f(y) + f(z))) = f⁻¹(f(x)f(y) + f(x)f(z)) = (x · y) + (x · z). Similarly, we should have (y + z) · x = (y · x) + (z · x).
So it looks like we can indeed get a rig structure on X in this way! That's pretty cool! I'm now wondering if this is just a special case of a more general strategy for inducing algebraic structures.
Hmm, I'm guessing that the rig we get is always isomorphic to the rig R we started with. (I'm guessing that bijective rig morphisms are in fact isomorphisms in the category of rigs.) If so, that makes this a bit less exciting.
However, the article manages to take some kind of limit (of the rigs we get in the above way as our bijection changes in a parametrized way) to get a new rig. That's interesting! I would not have expected that a bunch of isomorphic rigs could "approach" a different rig in some sense.
(The new rig we get is the tropical rig!)
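Here is a numerical sketch of that limit in Python (following the blog post's idea as I understand it, with my own notation): for each temperature T > 0 we transport ordinary addition along the bijection f_T(x) = exp(-x/T), getting a rig isomorphic to the usual one, yet as T → 0 the transported addition approaches min, the tropical addition.

```python
import math

def trop_add(x, y, T):
    # transported addition f_T⁻¹(f_T(x) + f_T(y)) with f_T(x) = exp(-x/T),
    # written in a numerically stable form
    m = min(x, y)
    return m - T * math.log1p(math.exp(-abs(x - y) / T))

def trop_mul(x, y, T):
    # transported multiplication f_T⁻¹(f_T(x) · f_T(y)) = x + y, for every T
    return x + y

# as T shrinks, the transported addition of 1 and 3 approaches min(1, 3) = 1
for T in [1.0, 0.1, 0.001]:
    print(T, trop_add(1.0, 3.0, T))
```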
David Egolf said:
Let f: X → R be a bijection and let R be a rig. Can we make X a rig as follows?
That works not just for rigs but for anything mathematicians might want to make a set into: groups, fields, topological spaces, quasitriangular Hopf algebras, you name it.
Hmm, I'm guessing that the rig we get is always isomorphic to the rig R we started with.
Yes, that's how it works. We call this "transporting a structure along a bijection".
Most mathematicians take these general facts for granted after having checked them in 5 or 6 examples. But stating these general facts precisely and making them into a theorem is a job for category theorists!
I believe these facts amount to saying that if X is any 'reasonable' category of sets with extra structure, the forgetful functor U: X → Set is an [[isofibration]]. So then the challenge is to explain why forgetful functors tend to be isofibrations. I could say more....
However, the article manages to take some kind of limit (of the rigs we get in the above way as our bijection changes in a parametrized way) to get a new rig. That's interesting!
Yes, that's the only interesting thing about this business.
I would not have expected that a bunch of isomorphic rigs could "approach" a different rig in some sense.
This type of thing happens a lot, and there's a branch of math called deformation theory that studies it. In special relativity, for example, it doesn't really matter what the speed of light c is: for any value of c you get an isomorphic version of special relativity. But when you take the limit c → ∞ you get a nonisomorphic theory, namely Newtonian mechanics.
It was noticed by Wigner in 1953 (if not sooner by someone else) that new theories of physics often have older theories as limits in this sense. All these theories are actually associated to mathematical structures - for example, most theories have symmetry groups associated to them. So Wigner developed a theory of how to take a limit of a bunch of isomorphic groups and get a different group.
I think my blog post listed a bunch of physical theories that reduce to other theories in some limit.
It's very cool to learn that deformation theory exists. It appears that it's closely related to algebraic geometry, which currently makes it hard for me to understand many of the examples. But the basic idea of something different somehow happening in a limit is pretty magical (in a good way)!
I believe these facts amount to saying that if X is any 'reasonable' category of sets with extra structure, the forgetful functor U: X → Set is an [[isofibration]]. So then the challenge is to explain why forgetful functors tend to be isofibrations. I could say more....
"Isofibration" is one of those words I've wanted to understand for a while. This might provide a great motivating example! I see the nLab says that an isofibration is roughly a functor such that every isomorphism in can be "lifted" to an isomorphism in .
In our case, we have a forgetful functor , and we roughly want to be able to induce an isomorphism between objects in given an isomorphism between their underlying sets. So if I have a structure with underlying set for some set , then we hope we can lift this bijection to an isomorphism in . We want to lift this -isomorphism to a -isomorphism between and some preimage of under .
We could ask: when is the forgetful functor U: Set^T → Set for a monad T an isofibration? Here Set^T is the Eilenberg-Moore category of the monad T, which I think can often be thought of as a category of "algebraic structures".
If you're comfortable with the definition of Set^T you might try to think through this in that generality for yourself!
David Egolf said:
It's very cool to learn that deformation theory exists. It appears that it's closely related to algebraic geometry, which currently makes it hard for me to understand many of the examples.
That's just because algebraic geometers like studying things in continuously varying families depending on some parameters, like ellipses
x²/a² + y²/b² = 1
which depend on two parameters a and b. So, given anything they enjoy seeing how they can fit it into a family and 'deform' it, like taking a circle and squashing it into an ellipse. So naturally they infect deformation theory with their jargon and make it hard for other people to understand. :upside_down:
But physicists and other people also use deformation theory, often in blissful ignorance of the power tools of algebraic geometry. So there's stuff to read that doesn't assume algebraic geometry, but usually it's about some particular case of deformation theory.
But the basic idea of something different somehow happening in a limit is pretty magical (in a good way).
Indeed! This is how you get special relativity from general relativity, classical mechanics from quantum mechanics, and (as I show in my blog series) classical thermodynamics from statistical mechanics.
David Egolf said:
We could ask: when is the forgetful functor U: Set^T → Set for a monad T an isofibration? Here Set^T is the Eilenberg-Moore category of the monad T, which I think can often be thought of as a category of "algebraic structures".
As Kevin hinted, this is an excellent and not very hard question. Maybe the hardest part is remembering the definition of an isofibration! :wink:
By the way, I tend to throw you lots of 'exercises' since you seem to have an appetite for them, but if you ever just want to know the answer (like to your question above), just ask. There's a limit to how many exercises anyone can do!
I like exercises/puzzles - they help me understand things! In this case, I'm feeling a bit intimidated by all the fancy words. But it is encouraging to know that this isn't supposed to be too hard... Let me see what I can come up with.
I will bite my tongue, then.
I think I figured it out! Let f in Set be an isomorphism from U(A, a) to some set B. Here U: Set^T → Set is our forgetful functor. I believe that U is always(!) an isofibration.
To show this, we need to find an isomorphism in Set^T that maps down to f by U. Such an isomorphism would be a commutative square of this form, where f and T(f) are both isomorphisms:
commutative square
If we set the top arrow to T(f), we know that T(f) is an isomorphism because functors preserve isomorphisms. And we can make the diagram commute by setting b = f ∘ a ∘ T(f)⁻¹. This isomorphism maps down to f under U - as U just grabs the lower part of this diagram - so I think U is an isofibration!
(In the above example, we might think of (A, a) as a group, A as its underlying set, and B as a set.)
Yes, that sounds right! Great! The formula b = f ∘ a ∘ T(f)⁻¹ generalizes how you might transport any binary operation m: A × A → A along a bijection f: A → B to get a new binary operation m' = f ∘ m ∘ (f⁻¹ × f⁻¹): B × B → B.
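Transporting a binary operation along a bijection is easy to play with in code; here's a sketch (the bijection x ↦ x³ on the reals is just an example):

```python
import math

def transport(m, f, f_inv):
    """Transport a binary operation m along a bijection f, via f(m(f_inv(x), f_inv(y)))."""
    return lambda x, y: f(m(f_inv(x), f_inv(y)))

f = lambda x: x ** 3                                    # a bijection on the reals
f_inv = lambda x: math.copysign(abs(x) ** (1 / 3), x)   # its inverse
add_prime = transport(lambda a, b: a + b, f, f_inv)     # transported addition

# f is now an isomorphism from (R, +) to (R, add_prime) by construction,
# so add_prime inherits associativity and commutativity from +.
print(add_prime(1.0, 8.0))  # f(1 + 2) = 27.0
```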
So this shows that when I said any 'reasonable' structure on a set (or indeed an object in any category) can be transported along bijections (or isomorphisms), this includes all structures that come from monads.
There are many 'reasonable' structures on sets that don't come from monads, like 'being a topological space'. So one might want an even more general theorem.
But let me instead describe an 'unreasonable' structure. Define a 'goofy structure' on sets like this:
If the set is one particular set (fix your favorite), a goofy structure on it is a way of making it into a group.
If the set is anything not equal to that chosen set, a goofy structure is a way of making it into a field.
The "goofy structure" I suppose is "unreasonable" because the goofy structures we can put on a set can be essentially different even for isomorphic sets. So this concept is violating the principle of equivalence.
One might wonder how one can detect if a notion of structure is "reasonable" in the sense that it doesn't lead to essentially different structures on isomorphic objects.
Well...how about, if the forgetful functor from the category of those structures is an isofibration? :grinning:
Yes! I was just about to propose something like that! That seems pretty reasonable to me.
I'd say that's exactly the conceptual content of the notion of isofibration, so I think we're on the same page.
We could then say that a category of (reasonable) structures on the objects in a category B is a category X equipped with an isofibration U: X → B.
I might just complicate that with the "stuff, structure, property" triumvirate--if U is not faithful, I probably wouldn't want to call X a category of structures on B. But (one wonders!) surely there are non-faithful isofibrations.
Yes, e.g. the projection out of a product of two categories (either one) is an isofibration.
Of course, that was just meant to be a temptation to David to think about it.
Indeed, every functor is equivalent (in an appropriate sense) to an isofibration; one might enjoy learning about the canonical model structure on the category of small categories in this light.
Oh, sorry.
No worries, my fault for being coy.
David Egolf said:
The "goofy structure" I suppose is "unreasonable" because the goofy structures we can put on a set can be essentially different even for isomorphic sets. So this concept is violating the principle of equivalence.
Right! But here's an interesting twist: the concept of 'isofibration' itself violates the principle of equivalence. You can have two functors
F, G: C → D
that are naturally isomorphic, with one being an isofibration and the other not. Let me see if I can make up an easy example. The identity functor
1: Set → Set
is an isofibration (since all identity functors are). But it's naturally isomorphic to a functor that sends all 1-element sets to the same 1-element set. (Yet another puzzle: see why?) This latter functor is clearly not an isofibration.
So if you call the violation of the principle of equivalence "evil", and you think being an isofibration is not evil, i.e. "good", you're in the funny situation that it's evil to say a functor is good!
In fact, every functor can be 'repaired' to get an isofibration, without changing anything important.
More precisely, given a functor
F: A → B
we can make A into a subcategory of a larger category A', where the inclusion
i: A → A'
is an equivalence, and find an isofibration
F': A' → B
such that
F = F' ∘ i.
The idea is that we puff up the category A to get a bigger but equivalent category A', which allows us to replace the functor F by a functor F' that's an isofibration, basically because A' has enough isomorphisms to do all the lifting we want in the definition of 'isofibration'.
John Baez said:
But here's an interesting twist: the concept of 'isofibration' itself violates the principle of equivalence. You can have two functors
F, G: C → D
that are naturally isomorphic, with one being an isofibration and the other not. Let me see if I can make up an easy example. The identity functor
1: Set → Set
is an isofibration (since all identity functors are). But it's naturally isomorphic to a functor that sends all 1-element sets to the same 1-element set. (Yet another puzzle: see why?) This latter functor is clearly not an isofibration.
That makes sense! (And I think I was able to work out the details of checking that we have a natural isomorphism between two such functors.)
John Baez said:
The idea is that we puff up the category A to get a bigger but equivalent category A', which allows us to replace the functor F by a functor F' that's an isofibration, basically because A' has enough isomorphisms to do all the lifting we want in the definition of 'isofibration'.
Interesting! Intuitively, I guess the repair process creates a version of that better respects the principle of equivalence, in terms of what is in its image.
I think I was able to work out the details of checking that we have a natural isomorphism between two such functors.
Excellent! It's closely related to the idea of a [[skeleton]]: every category C is equivalent to a category where isomorphic objects are equal, and you can set it up so that the functor in this equivalence is the inclusion of a subcategory of C, gotten by choosing one object from each isomorphism class of objects in C.
When we do this trick for FinSet, we usually choose the objects of the skeleton to be the finite ordinals - our 'favorite' n-element sets.