Strongly encouraged by @John Baez , here is the new engaging discussion topic for today. One might perhaps think it is a bit old-fashioned (1969), but actually it should be a good one... And I love the word "Universe". Anyway, I was reading https://link.springer.com/chapter/10.1007/BFb0059147 by Saunders Mac Lane and some other great considerations of Feferman, and wondering: is this the end of the story, namely ZFC with a universe? What happened after 1969?
Well, Martin-Löf developed his dependent type theory in 1971 - 1976 and then people showed in the 2010s that the types in his type theory actually model infinity-groupoids.
Indeed, and what about predicativity?
Federica Pasqualone said:
Indeed, and what about predicativity?
Martin-Löf's type theory as usually presented is predicative in the sense that power sets cannot be constructed. But it's not the same kind of predicativity as that found in classical mathematics. These type theories usually accept function sets but not excluded middle, so function sets do not imply power sets.
With universes the type theory would have "power sets" defined relative to a universe $$U$$, but the "power sets" themselves would still be external to $$U$$, so they would behave more like power classes from set theory.
I am wondering if we want to elaborate a little more on the notion of predicativity, maybe there are also people less familiar with logical foundations, idk.
Ok, I will do it - but guys do not be shy to contribute here and/or ask!
As I learned it, remember the paradoxes after Dedekind & Cantor promoted their set theory? They arise more or less all from predicativity issues! or better: impredicativity!
About predicativity I'll quote a conversation I had with Josselin Poiret from another thread:
Madeleine Birchfield said:
In the context of predicativism I think one has to make a distinction between "power sets exist but aren't small" and "power sets don't exist in the theory at all". I don't really consider the former to be predicative because you can still do everything you can in impredicative mathematics, only with the additional bureaucracy of keeping track of size or universe levels.
The latter is much more restrictive since certain constructions (e.g. the Dedekind reals via Dedekind cuts) aren't really possible without power sets.
Josselin Poiret said:
Madeleine Birchfield said:
I don't really consider the former to be predicative because you can still do everything you can in impredicative mathematics, only with the additional bureaucracy of keeping track of size or universe levels.
you cannot: see what I mentioned about strong normalization of System F, or the existence of complete lattices
Josselin Poiret said:
impredicativity manages to bump up the logical strength of the theory
Madeleine Birchfield said:
Josselin Poiret said:
you cannot: see what I mentioned about strong normalization of System F, or the existence of complete lattices
Complete lattices in Agda/Coq/book HoTT are a size issue as well, they are defined complete relative to a universe in the same way that Grothendieck toposes are defined to have all coproducts relative to a universe.
Josselin Poiret said:
while you might see it as just a size issue it does end up affecting logical strength, and this restriction is IMO exactly the one of predicativism: one cannot build possibly self-referential objects, ie. objects quantified over what they themselves are
The business, roughly, is all about self-referential definitions ... do you remember Russell's paradox? Anyone?
I'm only familiar with the "informal" definition of impredicativity: a self-referencing definition. E.g. Russell's set $$\{x \mid x \notin x\}$$.
That's exactly what all the enterprise is about here! Great!
Martin-Löf's original type theory from 1971 had a universe which contained every single type in the type theory, including itself.
Then somebody proved a type theoretic version of the Burali-Forti paradox from the universe which contained itself, and Martin-Löf had to modify his theory so that the universe doesn't contain itself.
It was later shown that if the type theory has W-types and a single universe containing every single type, then one can use the W-type for the universe to reconstruct the traditional membership-based Russell's paradox.
There is also another little ingredient, @Madeleine Birchfield : do we want to introduce some intuitionistic logic? You sort of cited it already .. didn't you?
Yes, so intuitionistic logic is logic done without the excluded middle.
I.e. without the rule $$P \lor \neg P$$
And could you also elaborate a little more for our audience about it? and its implications here?
How does intuitionistic logic relate to im/predicativity?
Madeleine Birchfield said:
Federica Pasqualone said:
Indeed, and what about predicativity?
Martin-Löf's type theory as usually presented is predicative in the sense that power sets cannot be constructed. But it's not the same kind of predicativity as that found in classical mathematics. These type theories usually accept function sets but not excluded middle, so function sets do not imply power sets.
With universes the type theory would have "power sets" defined relative to a universe $$U$$, but the "power sets" themselves would still be external to $$U$$, so they would behave more like power classes from set theory.
So with excluded middle, the set of truth values only has two elements in it, $$\top$$ and $$\bot$$, and so power sets are just function sets into $$\{\top, \bot\}$$. This means that predicativists who assume classical logic cannot assume that function sets exist between every single set.
However, without excluded middle, it is no longer the case that the set of truth values only has two elements. This means that power sets no longer coincide with function sets into $$\{\top, \bot\}$$, and predicativists who assume constructive logic can safely assume that all function sets exist, so long as they do not assume that the set of truth values exists.
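To make the distinction concrete, here is a minimal sketch in Lean 4 (the names `PowerSet` and `TwoValuedSubsets` are just illustrative, not from the discussion): subsets valued in all of `Prop` versus subsets valued in the two-element type `Bool`. Excluded middle is what would let one pass from the former back to the latter.

```lean
-- "Power set": Prop-valued predicates on A.
def PowerSet (A : Type) : Type := A → Prop

-- "Function set into 2": Bool-valued functions on A.
def TwoValuedSubsets (A : Type) : Type := A → Bool

-- Every Bool-valued function gives a predicate; the converse needs excluded middle.
def ofBool {A : Type} (f : A → Bool) : PowerSet A := fun a => f a = true
```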
The question then is: what is a function set? Lemme facilitate the discussion.
Given sets $$A$$ and $$B$$, the function set $$B^A$$ or $$A \to B$$ is the set whose elements are the functions with domain $$A$$ and codomain $$B$$.
Well I mean, roughly speaking top and bottom there are 1 and 0, for the ones that prefer to go mod 2 :smiley: [in the presence of EM]
Madeleine Birchfield said:
Given sets $$A$$ and $$B$$, the function set $$B^A$$ or $$A \to B$$ is the set whose elements are the functions with domain $$A$$ and codomain $$B$$.
This sounds very categorical, doesn't it? :stuck_out_tongue:
In category theory those are usually called "exponential objects".
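As a small sketch of that last remark (Lean 4, with illustrative names): the function set $$B^A$$ is just the function type, and the exponential's universal property is currying.

```lean
-- Maps C × A → B correspond to maps C → (A → B): the universal property of the exponential.
def curry {A B C : Type} (f : C × A → B) : C → (A → B) :=
  fun c a => f (c, a)

def uncurry {A B C : Type} (g : C → (A → B)) : C × A → B :=
  fun p => g p.1 p.2
```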
I am quite curious to see if there is any question from the audience at this point ... Is there any? - We should plan to do it in real life with a mic and all the rest btw
Thanks @Madeleine Birchfield !
Just to be sure, a predicativist is someone who refuses self-referencing definitions. But I guess that it's ok to have self-referencing definitions as long as "it terminates". E.g., the set of binary trees.
So, is there a precise characterization of definitions that are acceptable by a predicativist?
Here is the trick: every time you quantify over a universe, you go one step higher. That's perfectly fine! Isn't it?
Ah .. do we really want to go into Reducibility and all the magic of types as well?
Anyone who has the Axiom of Infinity and the Axiom of Reducibility at hand? [Don't be shy!]
Peva Blanchard said:
Just to be sure, a predicativist is someone who refuses self-referencing definitions. But I guess that it's ok to have self-referencing definitions as long as "it terminates". E.g., the set of binary trees.
So, is there a precise characterization of definitions that are acceptable by a predicativist?
This really depends on the predicativist and their beliefs. There are many people calling themselves predicativists who would accept anything and everything so long as the things they assume don't yield power sets. They would accept say the induction principle for the natural numbers or the induction principle for binary trees, since to them impredicativity = power sets, and neither induction principle proves that power sets exist.
Yes, maybe we should also relate the two things: namely the axioms with the natural numbers issue ... we are not speaking two languages hahha
Ok, so long story short: if one wants to carry out proofs by induction - and we love them in maths - one also has to assume the Reducibility axiom which, as stated in PM, is just there ad hoc to make those things work.
It is especially hard to see how one would "move to the next level" for the induction principle of the natural numbers in some versions of Martin-Löf type theory - there is only one level, and the propositions / truth values are represented in the theory by subsingletons (i.e. types / sets $$A$$ such that $$x = y$$ for all elements $$x$$ and $$y$$ in $$A$$) and logical operators are represented by the usual set theoretic / type theoretic operations on subsingletons. I.e. truth is any singleton, falsehood is the empty type, and universal quantification of a predicate is the Cartesian product of a family of subsingletons in Martin-Löf type theory.
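A minimal Lean 4 sketch of the "propositions as subsingletons" idea just described (the name `IsSubsingleton` is illustrative): any two elements are equal, truth is a singleton, falsehood is the empty type.

```lean
-- A subsingleton: any two of its elements are equal.
def IsSubsingleton (A : Type) : Prop := ∀ x y : A, x = y

-- Truth is (any) singleton, falsehood is the empty type.
example : IsSubsingleton PUnit := fun x y => by cases x; cases y; rfl
example : IsSubsingleton Empty := fun x _ => nomatch x
```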
And there are some predicativists who would not regard any version of Martin-Löf type theory as being "predicative" because it has induction of the natural numbers but no way to move to a different level.
This is why I say that it all depends upon the predicativist and their beliefs.
That's an amazing summary!
Because many mathematicians would regard Martin-Löf type theory, as well as other theories like CZF and CETCS, as predicative theories, even though it does come with induction of the natural numbers and no level distinction in the logic.
Please guys take note of this remark! :100: :memo:
Well .. let's leave the groupoids wonderful mysteries for the next time ....
It's almost 2am here so I have to leave, but thanks for the discussion!
Thank you for participating!
Madeleine Birchfield said:
However, without excluded middle, it is no longer the case that the set of truth values only has two elements.
Could you elaborate on this? What is the set of truth values without excluded middle? And what do you even mean by truth value in this context?
There are other versions of Martin-Löf type theory where it is possible to move to the next level, so they are more amenable to the sort of predicativity that @Federica Pasqualone is talking about. These versions of Martin-Löf type theory usually first define an infinite hierarchy of universes, and then define type operations and logical operations for the types in each universe.
The trick to replicate moving to the next level is to assume that the function sets / types and the indexed Cartesian products / dependent product types lie in the next universe level. That way, universal quantification and the induction principle of the natural numbers will lie in the next universe level. However, most type theorists working in this kind of type theory don't do this and I think the only person who has done something like this so far is @Ulrik Buchholtz and his work on primitive recursive arithmetic in homotopy type theory, where the bottom universe level doesn't contain function types / dependent product types.
Jean-Baptiste Vienney said:
Madeleine Birchfield said:
However, without excluded middle, it is no longer the case that the set of truth values only has two elements.
Could you elaborate on this? What is the set of truth values without excluded middle?
The set of truth values contains truth values, such as $$\exists n \in \mathbb{N}.\, f(n) = 1$$ for any function $$f : \mathbb{N} \to \{0, 1\}$$, which cannot in general be proven to be logically equivalent to $$\top$$ or $$\bot$$ in the absence of [[excluded middle]].
But to start, what is the definition of truth value in this context? (EDIT: for me truth value means by definition "true" or "false" so I don't understand what you're talking about although it seems very interesting)
Madeleine Birchfield said:
There are other versions of Martin-Löf type theory where it is possible to move to the next level, so they are more amenable to the sort of predicativity that Federica Pasqualone is talking about. These versions of Martin-Löf type theory usually first define an infinite hierarchy of universes, and then define type operations and logical operations for the types in each universe.
The trick to replicate moving to the next level is to assume that the function sets / types and the indexed Cartesian products / dependent product types lie in the next universe level. That way, universal quantification and the induction principle of the natural numbers will lie in the next universe level. However, most type theorists working in this kind of type theory don't do this and I think the only person who has done something like this so far is Ulrik Buchholtz and his work on primitive recursive arithmetic in homotopy type theory, where the bottom universe level doesn't contain function types.
This is actually what I meant by: when you quantify over a universe, you go one step higher.
Federica Pasqualone said:
This is actually what I meant by: when you quantify over a universe, you go one step higher.
Except most type theorists using this kind of type theory don't actually lift universal quantification to the next universe level. Take a look at the type theory in Coq or Agda, and the universal quantification lies in the same universe as the family of subsingletons that they are quantifying over; the same goes for the textbooks on this kind of type theory like the [[HoTT book]]. And so Coq/Agda/book HoTT aren't predicative in the sense that you are using, but they are predicative in the sense that the practitioners of the theory are using (i.e. no power sets).
In general, for predicativity in the sense you are describing, one would want the universes in the type theory to correspond to the regular cardinals in set theory, even constructively, while currently in Coq/Agda/book HoTT, they correspond to the inaccessible cardinals in set theory.
I see, well ... I never play by the book :grinning:
But yeah, this is a crucial remark for people that are approaching these theories and wanted to learn more! Thanks for pointing that out!
I think I got what you mean by "it is no longer the case that the set of truth values only has two elements". Do you mean that either a proposition is provable, or its negation is provable, or neither of the two is provable? So each proposition can be assigned one of these three "truth values".
Jean-Baptiste Vienney said:
I think I got what you mean by "it is no longer the case that the set of truth values only has two elements". Do you mean that either a proposition is provable, or its negation is provable, or neither of the two is provable? So each proposition can be assigned one of these three "truth values".
It isn't three truth values. It is usually an infinite number of truth values. Think about how many elements are in the free Heyting algebra on a set of propositions.
But what do you mean by truth value?
Well, every proposition is either true or false, and you assign 0 or 1. But this logic only works if you have the excluded middle!
So the set of truth values is called the [[subobject classifier]] in category theory / topos theory and usually characterised by a particular universal property, and then a truth value is just an element of this subobject classifier.
In a set theory like IZF, the set of truth values is defined as the double powerset of the empty set, and a truth value is just an element of this set.
In dependent type theory, propositions / truth values are literal subsingletons, and so the set of truth values is just the set of all subsingletons, suitably defined using inference rules, or using universes and the usual type theoretic operations.
In all cases, while there is an injection from the set $$\{\bot, \top\}$$ to the set of truth values, this is not provably a bijection unless excluded middle holds.
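A one-line Lean 4 sketch of that injection (illustrative name `boolToProp`), with propositions standing in for truth values: it is injective, but without excluded middle one cannot show that every proposition is logically equivalent to one of the two values.

```lean
-- The canonical map from the two-element type into the truth values.
def boolToProp : Bool → Prop := fun b => b = true

-- For example, `true` is sent to a true proposition:
example : boolToProp true := rfl
```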
Can you give a bit of intuition on what the set of truth values means, in addition to the definition? I'm looking at the nLab page on the subobject classifier and I really have no intuition about what this object of truth values is doing. For instance, in IZF, how do you interpret that the set of truth values is $$\mathcal{P}(\mathcal{P}(\emptyset))$$?
How are natural numbers defined in ZF? Think about it ...
And what is a powerset in IZF? Because with my intuition of powersets, I would like to write which is probably not what you mean.
In IZF and ZFC and other set theories like that, the power set of the empty set is a singleton, a set with exactly one element.
Oh sorry, I should have written:
So do we still have two truth values in IZF or did I do something wrong??
But without excluded middle, one cannot prove that $$\mathcal{P}(\mathcal{P}(\emptyset))$$ is equal to $$\{\emptyset, \{\emptyset\}\}$$.
Ok. So what's the meaning of $$\mathcal{P}(\mathcal{P}(\emptyset))$$ here?
Federica Pasqualone said:
How are natural numbers defined in ZF? Think about it ...
The set of all subsets of $$\mathcal{P}(\emptyset) = \{\emptyset\}$$. One cannot show that every subset of $$\{\emptyset\}$$ is equal to either $$\emptyset$$ or $$\{\emptyset\}$$ without excluded middle. So you end up having $$\emptyset$$ and $$\{\emptyset\}$$ together with a whole bunch of other subsets of $$\{\emptyset\}$$.
In addition, given any predicate $$P(x)$$ on $$\{\emptyset\}$$, one gets a subset of $$\{\emptyset\}$$ by the axiom of separation.
Oh, you define them with the empty set I guess. You define $$0 = \emptyset$$ and then $$n + 1 = n \cup \{n\}$$. Something like this?
Well, you simply associate 0, 1, 2, ... and so on to the recursive application of the power set operation on the empty set.
See for instance https://en.wikipedia.org/wiki/Set-theoretic_definition_of_natural_numbers
Oh, ok. It is simpler to see it like this.
Then $$2 = \mathcal{P}(\mathcal{P}(\emptyset))$$ is exactly the set you constructed previously.
In constructive mathematics, the set $$2$$ or $$\{0, 1\}$$ or $$\{\bot, \top\}$$ etc. still has a meaning: it is the set of all decidable truth values. A decidable truth value is a truth value $$P$$ which does satisfy excluded middle, i.e. $$P \lor \neg P$$.
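In Lean 4 notation this reads as follows (a sketch, with illustrative names): a decidable truth value is one that satisfies its own instance of excluded middle, and full excluded middle says every truth value is decidable.

```lean
-- A decidable truth value: one satisfying its own instance of excluded middle.
def IsDecidableProp (P : Prop) : Prop := P ∨ ¬P

-- Full excluded middle: every truth value is decidable.
def EM : Prop := ∀ P : Prop, IsDecidableProp P
```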
Madeleine Birchfield said:
But without excluded middle, one cannot prove that $$\mathcal{P}(\mathcal{P}(\emptyset))$$ is equal to $$\{\emptyset, \{\emptyset\}\}$$.
But what is key is the excluded middle!
oh we wrote at the same time - sorry!
What is the link between the definition of natural numbers as $$0 = \emptyset$$ and $$n + 1 = \mathcal{P}(n)$$, and the set of truth values?
I mean, is there something more when you go above $$2 = \mathcal{P}(\mathcal{P}(\emptyset))$$?
No, not really.
Ok
$$\mathcal{P}(\mathcal{P}(\emptyset))$$ is the set of truth values itself, while $$\mathcal{P}(\mathcal{P}(\mathcal{P}(\emptyset)))$$ is the powerset of the set of truth values; $$\mathcal{P}(\mathcal{P}(\emptyset))^{\mathcal{P}(\mathcal{P}(\emptyset))}$$ the set of endofunctions on the set of truth values, etc...
this doesn't really have anything to do with subsets of .
Madeleine Birchfield said:
The set of all subsets of $$\mathcal{P}(\emptyset) = \{\emptyset\}$$. One cannot show that every subset of $$\{\emptyset\}$$ is equal to either $$\emptyset$$ or $$\{\emptyset\}$$ without excluded middle. So you end up having $$\emptyset$$ and $$\{\emptyset\}$$ together with a whole bunch of other subsets of $$\{\emptyset\}$$.
In addition, given any predicate $$P(x)$$ on $$\{\emptyset\}$$, one gets a subset of $$\{\emptyset\}$$ by the axiom of separation.
For instance is a truth value?
Yeah, it would be.
Also a truth value: given any predicate $$P$$ on $$\mathbb{N}$$, the set $$\{x \in \{\emptyset\} \mid \forall n \in \mathbb{N}.\, P(n)\}$$
is a truth value. It is also one where you can't prove it's equal to either $$\emptyset$$ or $$\{\emptyset\}$$ for general $$P$$ without excluded middle. It's equal to $$\emptyset$$ if $$\forall n \in \mathbb{N}.\, P(n)$$ can be proven false and it's equal to $$\{\emptyset\}$$ if $$\forall n \in \mathbb{N}.\, P(n)$$ can be proven true, but currently we are unable to prove either for general $$P$$. Similarly for $$\{x \in \{\emptyset\} \mid \exists n \in \mathbb{N}.\, P(n)\}$$.
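Spelled out (my rewording of the pattern in the examples above, not from the thread): a statement $$\varphi$$ is packaged as the subset $$\{x \in \{\emptyset\} \mid \varphi\}$$ of the singleton, and

$$
\{x \in \{\emptyset\} \mid \varphi\} \;=\;
\begin{cases}
\{\emptyset\} & \text{if } \varphi \text{ is provable},\\
\emptyset & \text{if } \neg\varphi \text{ is provable},
\end{cases}
$$

while without excluded middle it need not be provably equal to either one.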
If I understand the nLab page on the subobject classifier, given a fixed object $$X$$ in a topos, there is a bijective correspondence between monomorphisms of the type $$A \hookrightarrow X$$, where $$A$$ can be any object, and morphisms to the type $$\Omega$$. So a subset of $$X$$ corresponds to a morphism $$X \to \Omega$$. It means that in IZF, a subset of a set corresponds to a function which assigns to every element of the set an element of $$\mathcal{P}(\mathcal{P}(\emptyset))$$, which is this really weird set. How am I supposed to understand this? I don't get any longer what a subset of a set is if we work with IZF. Giving a subset of $$X$$ corresponds to saying for every element of $$X$$ whether this element is in the subset, not in the subset, or ... where ... is a lot of things that I don't really understand for now.
Wait? Who switched the topic to topos theory here? :joy: .... ah this Heyting :heart_eyes:
A subset of a set in IZF is still the usual set-theoretic definition of a subset.
It's just that without excluded middle, one can no longer prove that $$x \in A$$ or $$x \notin A$$ for all $$x$$.
There are subsets of sets for which without excluded middle you just don't know whether an element is in the subset or not.
Ok. But given a subset $$A$$ of a set $$X$$, you can still say that the assertion "the element $$x$$ of $$X$$ is in the subset $$A$$" has a truth value (if I understand well). And giving a subset is exactly giving such a truth value to every element of $$X$$. I still don't get a good intuition of what it means when the truth value is neither true nor false.
What does it tell me if, given some $$x \in X$$ and some subset $$A$$ of $$X$$, the truth value of "$$x \in A$$" is $$\{y \in \{\emptyset\} \mid \forall n \in \mathbb{N}.\, P(n)\}$$
or $$\{y \in \{\emptyset\} \mid \exists n \in \mathbb{N}.\, P(n)\}$$
given some predicate $$P$$ on $$\mathbb{N}$$?
For now it looks like complete sorcery to me.
I get it somehow I think. It tells me that the assertion is true or the assertion is true I guess.
Jean-Baptiste Vienney said:
I still don't get a good intuition of what it means when the truth value is neither true nor false.
A good comparison is the axiom of choice in ZF. The axiom of choice is neither true nor false in ZF. It is independent of the ZF axioms.
So similarly, for a predicate $$P$$ on $$\mathbb{N}$$, if $$\forall n \in \mathbb{N}.\, P(n)$$ is neither true nor false in IZF, then $$\forall n \in \mathbb{N}.\, P(n)$$ is independent of the IZF axioms.
Let see what it gives if is . is . This one should be true.
Jean-Baptiste Vienney said:
Let see what it gives if is . is . Can we prove that it is false?
Well some predicates are provably true or false even in constructive mathematics. This is one of the examples of a provably true predicate in IZF.
What is a predicate $$P$$ on $$\mathbb{N}$$ such that $$\forall n \in \mathbb{N}.\, P(n)$$ is neither provably true nor provably false?
We are now moving to Goedel 1931?
This is becoming a terrific discussion touching all the fundamentals of mathematical logic btw. Well done!
Jean-Baptiste Vienney said:
What is a predicate $$P$$ on $$\mathbb{N}$$ such that $$\forall n \in \mathbb{N}.\, P(n)$$ is neither provably true nor provably false?
defined as
This one looked cool ahah
I don't think this example works though.
Let me see what means though
"x is a sequence of nonnegative integers such that:
or
What I actually want is a predicate $$P$$ on $$\mathbb{N}$$ which comes with a sequence such that .
Madeleine Birchfield said:
Jean-Baptiste Vienney said:
What is a predicate $$P$$ on $$\mathbb{N}$$ such that $$\forall n \in \mathbb{N}.\, P(n)$$ is neither provably true nor provably false?
defined as
What does work though is this:
Ok, let me write this in words:
"x is an element of such that:
More simply:
"x is the emptyset and for every sequence of integers, some term is nonzero or all the terms are zero."
Yes. The first statement is always true, so whether the element is in the subset or provably not in the subset depends solely on the truth of the second statement. If "for every sequence of integers, some term is nonzero or all the terms are zero" is provably true then $$\emptyset$$ is in the subset; if "for every sequence of integers, some term is nonzero or all the terms are zero" is provably false then $$\emptyset$$ is not in the subset.
But here we don't know if "for every sequence of integers, some term is nonzero or all the terms are zero" is true or false, without excluded middle.
Just for me: means
" is the emptyset and for every sequence of integers, some term is nonzero or all the terms are zero."
which is the same than
"for every sequence of integers, some term is nonzero or all the terms are zero"
Another example,
where is the axiom of countable choice.
Ok, so given an element $$x$$ of a subset $$A$$ of a set $$X$$, saying any of these is giving some truth value to "$$x \in A$$":
right?
Yes. Though strictly speaking, the latter two would be something like
and then since the first conditions, i.e. , are always true, it simplifies down to the second conditions, which are the statements you listed above.
What would be an example where the truth value really depends on $$x$$? These ones are funny but look a bit artificial as they don't really talk about $$x$$.
Then we can go to this example, where my predicate $$P$$ is on the function set $$\mathbb{N} \to \{0, 1\}$$ and is defined as $$P(f) := \exists n \in \mathbb{N}.\, f(n) = 1$$.
$$P(f)$$ is true if there is some $$n$$ with $$f(n) = 1$$; $$P(f)$$ is false if $$f(n) = 0$$ for all $$n$$, but in general one can't show that $$P(f) \lor \neg P(f)$$ for all $$f$$ without excluded middle.
The fact that $$\exists n \in \mathbb{N}.\, f(n) = 1$$ or $$\forall n \in \mathbb{N}.\, f(n) = 0$$ for all $$f : \mathbb{N} \to \{0, 1\}$$ is called the [[limited principle of omniscience]] for the natural numbers, and is independent of IZF but is provable in ZF with excluded middle.
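For reference, here is the statement in Lean 4 notation (a sketch; the name `LPO` is standard but this particular formulation is just illustrative):

```lean
-- Limited principle of omniscience for ℕ: every Bool-valued sequence either
-- hits `true` somewhere or is constantly `false`. Provable with excluded middle,
-- not provable constructively.
def LPO : Prop :=
  ∀ f : Nat → Bool, (∃ n, f n = true) ∨ (∀ n, f n = false)
```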
Let me sum up:
I don't understand why $$\mathcal{P}(\emptyset) = \{\emptyset\}$$ in IZF?
What axiom of IZF lets you prove this?
Because by the usual set theoretic definition of a subset and the axiom of the empty set, the only subset of the empty set is the empty set itself, and this is provable without excluded middle.
John Baez will have over a hundred comments to sort through when he reads this discussion thread tomorrow.
We did a great job!
Federica Pasqualone said:
We did a great job!
What foundations do you personally use for your mathematical work?
Madeleine Birchfield said:
Federica Pasqualone said:
We did a great job!
What foundations do you personally use for your mathematical work?
Mmh who knows? hahahah ... but between Zermelo and Russell I prefer Russell ...
More importantly I guess, do you use excluded middle in your logic or not?
What do you think?
I think you do. Most of the literature on algebraic quantum field theory uses excluded middle and I don't think any constructive mathematician has done the work to see which properties of, say, factorization algebras still hold in the absence of excluded middle.
It's a pretty big problem for constructive mathematicians in mathematics more generally. I think @Chris Heunen wrote a paper a few years ago where he synthetically defined a dagger-category satisfying certain axioms and then he used Soler's theorem to show that the dagger-category is the dagger-category of Hilbert spaces and linear maps. But no constructive mathematician has shown that Soler's theorem can be proven without excluded middle, which implies that he was using excluded middle throughout the paper.
And most constructive mathematicians aren't really interested in seeing what parts of the far flung branches of mathematics such as mathematical physics or geometric Langlands are still true without excluded middle, simply because there is a lot of preliminary work that needs to be done before constructive mathematicians are ready to tackle such topics, and it is possible that something breaks down along the way without excluded middle rendering the entire subject unattainable. For example, some parts of cohomology are impossible in constructive mathematics without some weak version of choice.
Well, there are connections between the theory of factorization algebras and algebraic quantum field theory, but they are quite different. Benini et al. compared the two theories.
I agree, without excluded middle it might get a little tricky. However, for logical investigations not having the excluded middle, as we've just seen, can be a lot of fun!
One more comment about predicativity. L.E.J. Brouwer himself was a predicativist in a strong sense, since according to Toby Bartels he apparently did not accept all function sets in his (informal) foundations, and would probably hate most modern constructive and intuitionistic predicative foundations of mathematics like CZF or MLTT because they do accept function sets:
https://golem.ph.utexas.edu/category/2008/12/the_status_of_coalgebra.html#c020731
Madeleine Birchfield said:
It's a pretty big problem for constructive mathematicians in mathematics more generally. I think Chris Heunen wrote a paper a few years ago where he synthetically defined a dagger-category satisfying certain axioms and then he used Soler's theorem to show that the dagger-category is the dagger-category of Hilbert spaces and linear maps. But no constructive mathematician has shown that Soler's theorem can be proven without excluded middle, which implies that he was using excluded middle throughout the paper.
Please correct me if you disagree, but wouldn't you say that a work such as Chris's can be perfectly justified from a constructivist perspective on mathematical physics, where the ultimate benchmark is not the relation of mathematical structures to each other, but instead the relation of mathematical structure to physical reality?
What I mean is: presumably, if I consider a certain mathematical result about mathematical structures that model physical objects to be fundamental to our understanding of physical reality, and I am a constructivist, I would like this result to be constructively provable. But then a common situation is: a result is constructively provable for certain structures $$X$$, but not for certain other structures $$Y$$ more commonly used to model the same objects.
Then, to prove that $$X$$ is classically (but, necessarily, not constructively) equivalent to $$Y$$ serves the purpose of convincing classically minded mathematicians that they can "safely" work with $$X$$ instead, which is preferable to constructivists.
I know you have worked on the locale of real numbers, which seems to me perhaps the most prominent example of this -- there's so many "fundamental" theorems that have constructive localic versions, but violate constructive "taboos" in other versions that are classically equivalent -- I'm thinking of Hahn-Banach, Gelfand duality...
So presumably a constructivist should believe that the locale of real numbers is a better model of the physical continuum, and would be justified in using classical logic to argue that a non-constructivist doesn't "lose" anything by moving to localic models.
(I don't think Chris is a constructivist, but on the other hand, he is Dutch...)
Off-topic, but fwiw: I strongly prefer constructive (mathematical) proofs when they're available. One of my MSc supervisors Wim Veldman, sometimes called the last living intuitionist, had a strong influence there. I do like locales and constructive analysis etc, and shy away from e.g. size issues because that's not the heart of the matter to me, or why I'm interested in a certain problem. I just can't bring myself to denounce all the beautiful maths based on the axiom of choice or excluded middle. It's just a different game, and who am I to judge whether people shouldn't derive pleasure from playing it? Of course, whether and how it relates to physical reality or some platonic notion of truth is a whole different question.
Slightly more on-topic: the paper about Hilbert spaces mentioned above does rely on Soler's theorem, of which I know no constructive (or in fact intuitive-to-me proof). But for that theorem you in fact don't need Soler! @Matthew Di Meglio and I figured out how to relate categorical colimits to analytic limits of real numbers directly, which is much more satisfying. See also Matt's nice blog post on the process of how that came to be. We've been working on a paper bringing it all together, hopefully out soon enough.
Thanks everyone for this engaging discussion. When I woke up yesterday I was puzzled and even mildly upset to find no new messages on this Zulip. When I woke up today there were 158 on just this one thread!
As for mathematical physics using constructive reasoning, it will definitely be fun for constructivists to build up the framework of analysis required to do more and more mathematical physics. But to get more mathematical physicists interested, it would be helpful to come up with some examples of where constructivism might help us understand physics. E.g. some ways in which a locale of real numbers is "better" - for the purposes of physics, not logic! - than the usual set of real numbers.
This is probably a very challenging project that would require expertise in both mathematical physics and logic. It would be quite risky for anyone to spend a lot of time on this. But if someone could succeed it might spark a revolution.
I described a bunch of problems in physics connected to the continuum here:
I listed various new approaches to logic and then said:
This sort of foundational work proceeds slowly, and is deeply unfashionable. One reason is that it rarely seems to impact ‘real life’. For example, it seems that no question about the experimental consequences of physical theories has an answer that depends on whether or not we assume the continuum hypothesis or the axiom of choice.
But even if we take a hard-headed practical attitude and leave logic to the logicians, our struggles with the continuum are far from over. In fact, the infinitely divisible nature of the real line—the existence of arbitrarily small real numbers—is a serious challenge to physics.
Federica Pasqualone said:
I am quite curious to see if there is any question from the audience at this point ... Is there any? - We should plan to do it in real life with a mic and all the rest btw
There is precedent for live-streaming math content, but my personal experience is that building up an audience is tricky and requires dedication.
John Baez said:
But to get most mathematical physicists interested, it would be helpful to come up with some examples of where constructivism might help us understand physics. E.g. some ways in which a locale of real numbers is "better" - for the purposes of physics, not logic! - than the usual set of real numbers.
This is a bit wacky and tongue in cheek, but I wonder whether some sort of paraconsistent logic would be more suitable to make sense of quantum physics, so that the cat can actually be dead and alive at the same time. In particular, dual intuitionistic logic seems to be a version of paraconsistent logic! Looks like some people have seriously thought in this direction.
I tend to get confused about constructibility, computability, and "physical realizability".
For instance, Chaitin's constant is defined as
$$\Omega = \sum_{p \in P} 2^{-|p|}$$ where $$P$$ is the set of (prefix-free) terminating programs (each represented by a finite binary sequence) for some universal Turing machine. $$\Omega$$ is a real number, usually interpreted as the probability that a random program halts. It is also an example of an algorithmically random number (Martin-Löf randomness). In particular, it is non-computable. If we believe in the physical version of the Church-Turing thesis, then it also means that one cannot build a physical device that can output $$\Omega$$ within an arbitrary tolerance range.
Does it mean that the definition of $$\Omega$$ is also non-constructive?
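One way to see where the constructive difficulty sits (my gloss, not from the thread): the finite-stage truncations of the sum are computable and increase towards $$\Omega$$, but bounding the remaining error would require deciding which programs eventually halt:

$$
\Omega_t \;=\; \sum_{p \in P_t} 2^{-|p|}, \qquad P_t \;=\; \{\, p \mid p \text{ halts within } t \text{ steps} \,\}, \qquad \Omega_t \;\le\; \Omega_{t+1} \;\le\; \Omega .
$$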
Madeleine Birchfield said:
There are subsets of sets for which without excluded middle you just don't know whether an element is in the subset or not.
An alternative interpretation which is literally valid in some contexts is that an element can be a member of a subset "to some extent"; it need not be entirely present or entirely absent from the subset.
Madeleine Birchfield said:
It isn't three truth values. It is usually an infinite number of truth values. Think about how many elements are in the free Heyting algebra on a set of propositions.
This is interesting, but there's something I wanted to bring up. When I was first learning logic, I made a big mistake by conflating intuitionistic logic with "many-valued logic". Fortunately, I quickly learned that the two are not the same nor do they converge in any sense. This is because "many-valued logic" does still have the law of excluded middle; it is just "expanded" to include the other truth values. For instance, a 3-valued logic with truth values A, B, and C still has the law of excluded middle, which states that a proposition must be A, B, or C, and nothing in between. The axioms defining a Boolean topos (where classical logic with LEM holds) seem to be independent of the question of how many arrows there are from the terminal object to the subobject classifier (which quantifies how many truth values your system has), which means most Boolean toposes actually won't be two-valued! I'm not sure if this independence holds in the other direction, that is, if it's possible to have a non-Boolean topos with a finite number of morphisms from the terminal object to the subobject classifier.
Peva Blanchard said:
Does it mean that the definition of $$\Omega$$ is also non-constructive?
A definition can only really fail to be constructive if it relies on a non-constructive result to be well-defined, and indeed this particular definition relies on LEM to say that "for all programs $$p$$, either $$p$$ halts or $$p$$ does not halt".
Tobias Fritz said:
This is a bit wacky and tongue in cheek, but I wonder whether some sort of paraconsistent logic would be more suitable to make sense of quantum physics, so that the cat can actually be dead and alive at the same time. In particular, dual intuitionistic logic seems to be a version of paraconsistent logic! Looks like some people have seriously thought in this direction.
This is interesting to bring up because I was just learning about logic and quantum mechanics. The way I see it, you can do quantum mechanical logic with the usual intuitionistic logic. Think about the proposition "the particle is here at x = 1". In classical logic, this proposition is either true (and the particle is there) or the proposition is false (and the particle is elsewhere). This only makes sense if you've already performed a measurement to see if the particle was at x=1, and only in the moment right after the measurement was conducted. Without any measurement to cause the particle to behave more classically, you can't say whether the proposition "the particle is here at x=1" is true or false, or has any other truth value associated to it (since, at least in the usual interpretation of QM, it doesn't make sense to ask if a non-measured particle even "has" a well defined notion of position at all!) Therefore, LEM does not apply and you are working in intuitionistic logic!
What I was reading about recently were "Bohr toposes". The internal logic of a Bohr topos is a "quantum logic". While quantum logic was invented before topos theory, the validity of the original conception is debated, and so it wasn't until topos theory that the notion of "quantum logic" could be put on a more rigorous footing. Since Bohr toposes need not be Boolean, this means that quantum logic is indeed inherently intuitionistic, just as I described above!
John Baez said:
As for mathematical physics using constructive reasoning, it will definitely be fun for constructivists to build up the framework of analysis required to do more and more mathematical physics. But to get more mathematical physicists interested, it would be helpful to come up with some examples of where constructivism might help us understand physics. E.g. some ways in which a locale of real numbers is "better" - for the purposes of physics, not logic! - than the usual set of real numbers.
Couldn't one simply apply the "principle of parsimony" to principles of reasoning as well? If there is
then one should prefer as a model?
Parsimony seems to be well-accepted as a way of ranking mathematical models of the physical world, so unless one is philosophically committed to consider logic as "fixed and pre-existing", then I do not see why it should not be counted as a part of the model?
John Onstead said:
What I was reading about recently were "Bohr toposes". The internal logic of a Bohr topos is a "quantum logic". While quantum logic was invented before topos theory, the validity of the original conception is debated, and so it wasn't until topos theory that the notion of "quantum logic" could be put on a more rigorous footing. Since Bohr toposes need not be Boolean, this means that quantum logic is indeed inherently intuitionistic, just as I described above!
That's a reasonable perspective and certainly worth exploring. But on the other hand, it seems doubtful to me that this approach most accurately reflects the physics, and I'm uncomfortable with calling it a "quantum logic". (And I'd expect the same objection to apply to a paraconsistent or dual intuitionistic approach.)
Here's why: if you take a 'quantum proposition' to mean a $$\{0,1\}$$-valued quantum observable, then this is mathematically modelled by a projection operator on the Hilbert space of wave functions, or equivalently by a (closed) subspace of that Hilbert space. Then these propositions form an orthomodular lattice. Notably, this propositional quantum logic then relaxes classical logic, as intuitionistic logic does as well, but it relaxes it in importantly different ways:
That's why the intuitionistic logic of a Bohr topos doesn't look like a "quantum logic" to me. As far as I understand, the mapping from quantum propositions to Bohr topos propositions fails to be a lattice homomorphism because the Bohr topos has additional propositions with unclear physical meaning. Am I getting this right? Perhaps you could object that the conception of quantum propositions as $$\{0,1\}$$-valued observables is too narrow and these extra propositions do have physical significance, but then I'd like to see a detailed account of what that significance is.
BTW I'm not trying to advocate for traditional quantum logic here; it seems to have been largely a failure so far, given that physicists aren't using it to reason about quantum systems, arguably because it's too clunky to be useful especially for quantitative things. And so far I don't see the Bohr topos approach being more useful for physics. I'd be glad to see evidence to the contrary.
Amar Hadzihasanovic said:
Parsimony seems to be well-accepted as a way of ranking mathematical models of the physical world, so unless one is philosophically committed to consider logic as "fixed and pre-existing", then I do not see why it should not be counted as a part of the model?
In mathematical physics, logical parsimony is much less highly valued than being able to predict a physical phenomenon someone else can't, or do a calculation or prove a theorem with implications for physics that someone else can't. In fact the only sort of parsimony I've seen mathematical physicists care about is parsimony in explicitly physical assumptions: e.g. people prefer a black hole singularity theorem if it makes weaker assumptions on the stress-energy tensor of the collapsing matter. I've never seen a mathematical physicist advertise that they can prove something without using excluded middle. (If they did, they'd probably be talking only to logicians.)
For what is described as an approach "rather complementary to Bohr toposes", you might enjoy Schreiber and Sati's Quantum Monadology, in particular the section 'Literature 1.2 (Epistemology of quantum physics and its formalization)'
But surely if, say (sorry for the "absurd" example, I can't think of a better one), it had turned out that no experimental fact about quantum mechanics relied on the existence of an inner product on Hilbert spaces, we would all be dropping it and using "plain" vector spaces in our mathematical foundations of quantum mechanics.
Why should the same status not be given to logical axioms?
It's probably true that if one could formulate Schrodinger's equation for the hydrogen atom and prove it gives the right spectrum without using excluded middle, one could get some attention.
It would not excite me very much, because I feel pretty confident that one could do this, in part by avoiding reliance on general theorems like the spectral theorem, and focusing on the specific example, where everything should be constructive.
But still it might be worth doing.
I would be more interested in a result that didn't concern an 'exactly solvable' model.
In Recursivity in quantum mechanics I proved that Schrodinger's equation for a collection of charged particles can be solved in a computable way, but that wasn't using constructive logic.
Morgan Rogers (he/him) said:
Peva Blanchard said:
Does it mean that the definition of $$\Omega$$ is also non-constructive?
A definition can only really fail to be constructive if it relies on a non-constructive result to be well-defined, and indeed this particular definition relies on LEM to say that "for all programs $$p$$, either $$p$$ halts or $$p$$ does not halt".
The sum is over programs that halt - where does LEM come in? Maybe we need it to prove the number exists, but I don't see it - I'm not very good at constructivity.
Peva Blanchard said:
Chaitin's constant is defined as
$$\Omega = \sum_{p \in P} 2^{-|p|}$$ where $$P$$ is the set of (prefix-free) terminating programs (each represented by a finite binary sequence) for some universal Turing machine.
(Sorry if this was mentioned somewhere in the previous 100 messages) - the basic difficulty with doing anything in calculus constructively is that the Bolzano-Weierstrass theorem (every bounded sequence has a convergent subsequence) requires both excluded middle and countable choice to prove, and most of the main theorems of real analysis rely on it. The combination of EM + countable choice can be constructivised using techniques like dialectica-type translations, but it's extremely gnarly
If I remember correctly, the Bolzano-Weierstrass theorem only requires a weak version of excluded middle called the [[analytic LPO]].
Personally for me, I'm currently looking at weak versions of excluded middle, such as
which would make the real numbers and the function sets and behave more as they do in classical mathematics.
Morgan Rogers (he/him) said:
Federica Pasqualone said:
I am quite curious to see if there is any question from the audience at this point ... Is there any? - We should plan to do it in real life with a mic and all the rest btw
There is precedent for live-streaming math content, but my personal experience is that building up an audience is tricky and requires dedication.
@Morgan Rogers (he/him) ... I was born for the stage ... :smirk: :mic:
John Onstead said:
Madeleine Birchfield said:
It isn't three truth values. It is usually an infinite number of truth values. Think about how many elements are in the free Heyting algebra on a set of propositions.
This is interesting, but there's something I wanted to bring up. When I was first learning logic, I made a big mistake by conflating intuitionistic logic with "many-valued logic". Fortunately, I quickly learned that the two are not the same nor do they converge in any sense. This is because "many-valued logic" does still have the law of excluded middle; it is just "expanded" to include the other truth values. For instance, a 3-valued logic with truth values A, B, and C still has the law of excluded middle, which states that a proposition must be A, B, or C, and nothing in between. The axioms defining a Boolean topos (where classical logic with LEM holds) seem to be independent of the question of how many arrows there are from the terminal object to the subobject classifier (which quantifies how many truth values your system has), which means most Boolean toposes actually won't be two-valued! I'm not sure if this independence holds in the other direction, that is, if it's possible to have a non-Boolean topos with a finite number of morphisms from the terminal object to the subobject classifier.
That's a very interesting remark! Thanks a lot for it! :memo:
To get the true effect of classical mathematics for and I'd probably have to pair
with something like
John Baez said:
Thanks everyone for this engaging discussion. When I woke up yesterday I was puzzled and even mildly upset to find no new messages on this Zulip. When I woke up today there were 158 on just this one thread!
And still counting! :zany_face:
So .. now we are on pointless topology? :heart_eyes:
Then, for the ones interested, I would suggest reading "Frames and Locales" [Picado, Pultr] and "Categorical Foundations Special Topics in Order, Topology, Algebra, and Sheaf Theory" - Chapter 2 - Picado, Pultr and my BSc supervisor Anna Tozzi on locales.
Amar Hadzihasanovic said:
Madeleine Birchfield said:
It's a pretty big problem for constructive mathematicians in mathematics more generally. I think Chris Heunen wrote a paper a few years ago where he synthetically defined a dagger-category satisfying certain axioms and then he used Soler's theorem to show that the dagger-category is the dagger-category of Hilbert spaces and linear maps. But no constructive mathematician has shown that Soler's theorem can be proven without excluded middle, which implies that he was using excluded middle throughout the paper.
Please correct me if you disagree, but wouldn't you say that a work such as Chris's can be perfectly justified from a constructivist perspective on mathematical physics, where the ultimate benchmark is not the relation of mathematical structures to each other, but instead the relation of mathematical structure to physical reality?
What I mean is: presumably, if I consider a certain mathematical result about mathematical structures that model physical objects to be fundamental to our understanding of physical reality, and I am a constructivist, I would like this result to be constructively provable. But then a common situation is: a result is constructively provable for certain structures $$X$$, but not for certain other structures $$Y$$ more commonly used to model the same objects.
Then, to prove that $$X$$ is classically (but, necessarily, not constructively) equivalent to $$Y$$ serves the purpose of convincing classically minded mathematicians that they can "safely" work with $$X$$ instead, which is preferable to constructivists.
And @Amar Hadzihasanovic , "structures" what a tempting plural ... do we want to talk about Alexander :heart: (Grothendieck) for a moment? Anyone?
It is well known in constructive analysis that if you want Georg Cantor's definition of the real numbers to behave like it does in classical mathematics, then you would need
Similarly, if you want Richard Dedekind's definition of the real numbers to behave like it does in classical mathematics, then you would need
This is necessary but not sufficient for mathematical physics - one would also want to study, say, subsets of the function set $$\mathbb{R} \to \mathbb{R}$$ such as the smooth functions or the analytic functions, or sequence spaces or other infinite-dimensional Hilbert spaces, which require that the associated locale be spatial and the tight apartness relation on its space of points be decidable in order for it to behave classically.
Then one may also consider operator algebras on such Hilbert spaces, which are usually subsets of the function set $$H \to H$$ for the Hilbert space $$H$$ in question, if I recall correctly. And for those to behave like they do classically, there need to be similar axioms stating that the locale is spatial and its space of points has decidable tight apartness.
Now I suppose constructivists can work instead with localic vector spaces, such as the localic versions of Hilbert spaces or Fréchet spaces, instead of the usual definition of Hilbert spaces or Fréchet spaces as topological vector spaces, but I haven't seen much literature in that direction yet.
I guess one problem, partly already addressed by @John Baez , is that people working on mathematical physics are usually not experts in logic, and the audiences of the journals are different; that's why it is a challenging project to rewrite physics logically or to do logical physics! :sparkles:
Many parts of mathematical physics, such as statistical mechanics and quantum mechanics, also rely on probability and statistics, and I don't know if any constructive mathematicians have developed a constructive theory of probability and statistics yet. The problem with the classical formalization of probability theory in terms of measurable spaces is that measurable spaces are not well behaved in the absence of excluded middle. Most of the literature on constructively more well-behaved alternatives to measurable spaces as a foundation for probability and statistics, such as measurable locales or the various categorical approaches, still assume excluded middle and dependent choice.
Federica Pasqualone said:
I guess one problem, partly already addressed by John Baez , is that people working on mathematical physics are usually not experts in logic, and the audiences of the journals are different; that's why it is a challenging project to rewrite physics logically or to do logical physics! :sparkles:
Another problem is that mathematicians have no formalisation of the path integral as used in quantum field theory, so until that happens, the path integral will not be amenable to constructive approaches.
The path integral, and the infinities in quantum field theory, are something mathematical physicists care about a lot - so if any new approach to mathematics really helps, they will be interested. But this seems to be a hard problem.
There have been various attempts to tame the infinities in quantum field theory using nonstandard analysis, for example, but I've never been convinced that they accomplish something useful for mathematical physics.
This is an interesting challenge ... :sunglasses:
I can give you extremely difficult challenges all day long for free. You have to pay extra to get problems that I have some idea how solve. :upside_down:
Now I am getting curious ... hahah
But yeah, when I was in Goettingen I got interested in nonstandard analysis. So if not this way, which way? ...
@Jules Hedges - when it comes to dealing with things like Bolzano-Weierstrass constructively, I believe for physics and other practical applications of analysis we don't need to prove every bounded sequence has a convergent subsequence. I believe we only need to prove that "sufficiently nice" bounded sequences have convergent subsequences. That is, I think we should allow more hypotheses in the theorem: hypotheses that we can check in practical applications.
I haven't checked that we can prove interesting results about mathematical physics constructively, so this line of research is another difficult challenge - but actually I think it's easier than my last challenge.
The challenge here is to develop enough analysis constructively to prove interesting results about specific physical systems. The idea would be to sidestep extremely general results like "every Hilbert space has an orthonormal basis", which are non-constructive, and take advantage of what we know about specific examples.
If any constructive mathematicians are looking for an extremely difficult challenge in mathematical physics they can try to prove the extended TQFT hypothesis without excluded middle.
How about trying something easier first?
To each open issue here we should attach as tag the level of complexity :grinning_face_with_smiling_eyes:
and Alexander teaches us that a good theory is built from the ground up and nothing should be left to chance ...
I didn’t see anybody mention, in the discussion of intuitionistic truth values above, that while it’s not true that every truth value is either $$\top$$ or $$\bot$$, it’s also not true that there are any other specific truth values you can name! I don’t remember the details of this well but I think the late David McCarty (who I guess was the second to last living intuitionist) once proved to me that the set of truth values is not finite, even though it has no specific exotic values.
That's a great remark! Thanks for sharing! :memo:
@Kevin Carlson The term "truth value" is in itself a bit misleading here. In the context of the discussion above, "truth value" refers to an element of a subobject classifier in some category. You can construct a topos whose internal logic is classical (LEM holds) even though you can name 4 different truth values. However, if there are exactly 2 truth values in the sense that , the logic of your topos is necessarily classical (LEM holds).
Abstracting away from categories, the very notion of "$$n$$-valued logic" has some subtlety to it.
For instance, depending on the specific definition, intuitionistic logic might be 2-valued, or might not qualify as $$n$$-valued for any $$n$$ - see my answer to a related question over at Math.SE.
I usually find topos theory helpful in semantically grounding such discussions. The category of functors $$M \to \mathbf{Set}$$ for a monoid (or one-object category) $$M$$ is a topos with two "truth values" (global elements $$1 \to \Omega$$), because there are only two subobjects of $$1$$, but the logic is nevertheless intuitionistic.
It probably has two "truth values" externally but you likely cannot prove it in the internal logic.
@Madeleine Birchfield The question is precisely what it means for a logic to have $$n$$ truth values, i.e. when you say that "having two truth values is not provable in the internal logic", what is the internal statement that you're claiming is not provable. There is no widely-agreed-upon standard for that. (It's of course much easier to agree upon what it means for a semantics to have $$n$$ truth values; topoi, in this sense, are a possible semantics for intuitionistic logic. But that does not really say much about the logic itself: in this case, there are two-valued topoi whose internal logic is nonclassical, and 65535-valued topoi whose internal logic is fully classical.)
It's weird, I've been using "excluded middle" as internal excluded middle; i.e. sets form a Boolean topos.
It is perfectly possible to define a classical set theory using intuitionistic logic - assuming the axiom of choice in IZF does this. The external logic itself is still intuitionistic in the sense that full excluded middle can't be proven for all propositions in the theory, but excluded middle holds in the internal logic (i.e. in the double powerset of the empty set).
The thing is that I've been using dependent type theory for the past few years and unlike set theory, it doesn't have an external logic. So excluded middle there is literally "every subsingleton is either a singleton or equivalent to the empty set".
Zoltan A. Kocsis (Z.A.K.) said:
But that does not really say much about the logic itself: in this case, there are two-valued topoi whose internal logic is nonclassical, and 65535-valued topoi whose internal logic is fully classical.
Ah, this seems to confirm what I was talking about above, that the "many-valued-ness" of a topos and whether or not it is Boolean (classical) are independent matters. To be honest I am a little surprised one can have a 2-valued non-Boolean topos though! Very interesting!
Madeleine Birchfield said:
Another problem is that mathematicians have no formalisation of the path integral as used in quantum field theory, so until that happens, the path integral will not be amenable to constructive approaches.
This might be an intriguing discussion all on its own, but I want to know more about what this means. Is this statement related to how apparently QFT is not "formalized" which necessitates things like AQFT and FQFT? If so I've never been able to see why this is the case. The path integral is a well formalized concept; every student learns about line integrals (also called path integrals) in first year multivariable calculus.
I also see ways of formalizing QFT itself that are immediately apparent. One way is to use the Schrodinger picture and just axiomatize QFT by stating it takes place in an infinite-dimensional Hilbert space where every vector is a quantum state for the quantum field. The quantum state can then be decomposed into eigenstates along the "field configuration" basis, allowing you to write the quantum state as a superposition of all possible field configurations. This leads to what is known as the "wavefunctional", a functional that takes in a possible field configuration and returns the probability amplitude that you will find that configuration when you take a measurement of the field. I don't see any problem with this axiomatization, especially because it is basically the same as how traditional QM is axiomatized: there's a Hilbert space, physical observables are Hermitian operators on this Hilbert space, quantum objects are described by vectors in Hilbert space, etc.
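For concreteness, the wavefunctional picture described above can be written down for a free real scalar field as follows (my addition, with $$\hbar = 1$$); the equation is standard but only formal, since making the functional derivatives and the "sum over field configurations" rigorous is exactly where the analytic difficulties appear:

$$i\,\frac{\partial}{\partial t}\Psi[\phi, t] \;=\; \int d^3x \left[ -\frac{1}{2}\frac{\delta^2}{\delta\phi(x)^2} + \frac{1}{2}\bigl(\nabla\phi(x)\bigr)^2 + \frac{1}{2} m^2 \phi(x)^2 \right] \Psi[\phi, t],$$

with $$|\Psi[\phi, t]|^2$$ the (formal) probability density of finding the field in configuration $$\phi$$.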
John Onstead said:
The path integral is a well formalized concept; every student learns about line integrals (also called path integrals) in first year multivariable calculus.
The path integral in quantum field theory is specifically talking about the Feynman path integral, rather than the line integral found in calculus:
https://en.wikipedia.org/wiki/Path_integral_formulation
Yes, @John Onstead - we're not talking about line integrals: we're talking about an integral over an infinite-dimensional space of paths, or some other infinite-dimensional space of maps. This is what shows up in quantum field theory, and it's difficult to formalize except in limited cases.
John Baez said:
Yes, John Onstead - we're not talking about line integrals: we're talking about an integral over an infinite-dimensional space of paths, or some other infinite-dimensional space of maps. This is what shows up in quantum field theory, and it's difficult to formalize except in limited cases.
Well if you Wick rotate into statistical physics and discretize into lattice (gauge) theory then it is “just” a finite but large Markov chain
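As a minimal illustration of that point (a sketch of my own, not from the thread): after Wick rotation and time discretization, expectation values for a 1D harmonic oscillator become ordinary finite-dimensional integrals, which one can sample with a Metropolis Markov chain. Parameter values below are illustrative only.

```python
import numpy as np

# Sketch: Metropolis sampling of a time-discretized Euclidean path integral
# for a 1D harmonic oscillator (hbar = 1, periodic boundary in Euclidean time).
# Euclidean action: S = sum_i [ m/(2a)*(x_{i+1}-x_i)**2 + (a/2)*m*w**2*x_i**2 ]
rng = np.random.default_rng(0)
N, a, m, w = 100, 0.1, 1.0, 1.0   # lattice sites, time step, mass, frequency (illustrative)
x = np.zeros(N)

def delta_action(x, i, new):
    """Change in the Euclidean action when site i is moved from x[i] to `new`."""
    ip, im = (i + 1) % N, (i - 1) % N
    def local(xi):
        kinetic = m / (2 * a) * ((x[ip] - xi) ** 2 + (xi - x[im]) ** 2)
        potential = 0.5 * a * m * w ** 2 * xi ** 2
        return kinetic + potential
    return local(new) - local(x[i])

samples = []
for sweep in range(4000):
    for i in range(N):
        proposal = x[i] + rng.uniform(-0.5, 0.5)
        if rng.random() < np.exp(-delta_action(x, i, proposal)):  # accept with prob min(1, e^{-dS})
            x[i] = proposal
    if sweep >= 500:                 # discard thermalization sweeps
        samples.append(np.mean(x ** 2))

# Continuum ground-state value is <x^2> = 1/(2*m*w) = 0.5; the lattice estimate
# should come out close to this for small a.
print("estimated <x^2> =", np.mean(samples))
```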
How much of the experiments and the analysis at the LHC or Fermilab or other particle physics laboratories actually use the Standard Model itself, as opposed to some discrete lattice gauge theory approximation of the Standard Model?
@Madeleine Birchfield I'm not sure I think that's a completely well-defined question given the way physicists think about physics, but if I recall correctly a large amount of hadron collision physics is done with semi-empirical formalisms like parton distribution functions, which aren't derived from lattice computations or lower-level principles like the standard model. That said other parts of various calculations are done using the standard model and perturbation theory.
Good evening everyone, wait ... now we are going for the Standard Model and beyond? :heart:
This thread is getting seriously dangerous hahahha [in the good sense] very Federica!
Ah I see it's the Feynman path integral formulation, my bad. I still don't see why this is hard to formalize- it's just an integral, even if the underlying space is complicated. Let's say we have standard 2D Euclidean space R^2 as well as the Euclidean interval [0,1]. Both of these are objects in most "nice" categories of spaces, which are cartesian closed. Thus, we may define an internal hom Hom([0,1], R^2) to be the space of all functions from [0,1] to R^2, which we can imagine as paths in R^2. This is just a space as any other. We can then define a function (functional) on this space e^(i S): Hom([0,1], R^2) -> C which sends a path to a complex number. If we choose two points in R^2 (representing initial and final states), we can then define a subspace of Hom([0,1], R^2) on all paths between these two points. Now we have our function to integrate as well as the domain of integration, so we can properly take the integral. At which point in this process does the formalization break down?
The immediate problem is that the domain of integration is infinite-dimensional.
In general, a path integral is given by some good measure and an action integrated over all possible paths, the action functional being the integral of a Lagrangian. How do we know this is a well-defined object? How do we treat divergences? And what about the domain of integration?
It's an infinities issue! :exploding_head:
In classical mechanics one gets trajectories of the system under investigation easily. I mean, one finds the potential and kinetic energies to write down the Lagrangian, integrates over time to get the action, and then applies Hamilton's principle. Solutions of the Euler-Lagrange equations are exactly the stationary points of the action functional. There is your trajectory ... and the system evolves along it.
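In symbols (a standard formula, added here just to restate the previous message): with action $$S[q] = \int_{t_0}^{t_1} L(q, \dot q, t)\, dt$$, Hamilton's principle $$\delta S = 0$$ is equivalent to the Euler-Lagrange equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q^i} - \frac{\partial L}{\partial q^i} = 0 .$$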
When you switch to the quantum world things get a little tricky, it is all about calculus of probability over an infinite number of possible trajectories.
Madeleine Birchfield said:
How much of the experiments and the analysis at the LHC or Fermilab or other particle physics laboratories actually use the Standard Model itself, as opposed to some discrete lattice gauge theory approximation of the Standard Model?
That's not really the kind of question that has a simple answer. For example, lattice models are especially important for studying low-energy quantum chromodynamics, like computing the mass ratio of the proton and neutron. But people working on this don't simply discretize QCD and put it on a lattice - the size of a lattice that people can compute with is much too small to get accurate results that way. So, elaborate methods based on the continuum are used to try to correct the errors due to working on a lattice.
Furthermore lattice computations are always done in Euclidean space, not Minkowski spacetime, as JR pointed out - this essentially amounts to replacing time $$t$$ with imaginary time $$\tau = it$$, which converts quantum field theory problems to statistical mechanics problems. So they need to be analytically continued back to get physically significant results, and the details of how this is done depend on what one is trying to compute.
So, please don't think people doing lattice gauge theory simply replace spacetime by a lattice, discretize quantum field theory, and do computations on that lattice.
@John Onstead wrote:
Now we have our function to integrate as well as the domain of integration, so we can properly take the integral. At which point in this process does the formalization break down?
Right where you said "so we can properly take the integral". You can't do an integral using just the domain of integration and the function to be integrated! You need a measure, or perhaps some generalization of a measure. That's where all the work lies.
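One of the "limited cases" where this has been made rigorous, mentioned here only as an illustration: for a quantum-mechanical Hamiltonian $$H = -\tfrac{1}{2}\Delta + V$$ (with a reasonable potential $$V$$), the Wick-rotated path integral is an honest integral against Wiener measure, via the Feynman-Kac formula

$$\bigl(e^{-tH} f\bigr)(x) \;=\; \mathbb{E}_x\!\left[ \exp\!\left( -\int_0^t V(B_s)\, ds \right) f(B_t) \right],$$

where $$B$$ is Brownian motion started at $$x$$. The genuinely hard cases are the oscillatory $$e^{iS}$$ integrals and, above all, interacting fields in four dimensions.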
You should read the answers here to get a rough sense of where we stand:
One of the $1,000,000 Millennium Prize problems involves making the path integral rigorous enough so that we can prove some things we expect are true.
It's a lot of money @John Baez ... :rolling_on_the_floor_laughing: :money:
Now everyone will start studying MT hahahhahahah
and logic of course hahhahah
Silly thought: maybe the Standard Model should actually be considered a billion-parameter model because it needs a physicist’s brain to make any predictions; and maybe the “theory of everything” will just be GPT-37 doing integrals like a physicist. :woman_scientist::robot:
I don't know ... AI ... mmh I am the kind of person who loves paper, her fountain pen and studies calligraphy! :feather:
And @Benedikt Peterseim , picture this: Being the one who formalized Feynman integrals!
If you are talking about just calculations afterwards, this is another story.
The "Theory of Everything" ... :thought:
Anyone on multiverse theory? :rolling_on_the_floor_laughing:
Please don’t take seriously what I wrote in case that wasn’t clear. :D
The slightly serious point, though, is that the “model” part in “Standard Model” is somewhat of a misnomer, I think, since it’s more of a framework that particle physicists use to derive various particular models. “Standard Theory of Particle Physics” would be better perhaps. After all, it’s also not called the “Classical Mechanics Model”; classical mechanics has many models, and so does particle physics.
Actually, thanks for bringing up "The theory of everything"! I am assuming you are not referring to the movie! :stuck_out_tongue:
ahhh ... the good old days of David Griffiths' book "Introduction to elementary particles"!
You're right! It is a model! and Feynman did a great job there too with his diagrams!
Federica Pasqualone said:
Actually, thanks for bringing up "The theory of everything" I am assuming you are not referring to the movie! :stuck_out_tongue:
No, the quotation marks are just there because I don’t even know what “theory of everything” might actually mean.
Federica Pasqualone said:
Good evening everyone, wait ... now we are going for the Standard Model and beyond? :heart:
That's it! Beyond the SM! The great unification, the dream of a unique theory that encompasses everything!
But I should also mention, given the appropriate nature of this thread hahah, that there are logical arguments against it too! (besides others of various nature)
Yes, here we are again: LOGIC :heart_eyes:
Federica Pasqualone said:
But I should also mention, given the appropriate nature of this thread hahah, that there are logical arguments against it!
Assuming this is referring to the standard model, I have some doubts about the future of the standard model that I would like to share. If this happens to be about beyond-standard-model physics, well, the possibility of any beyond-standard-model physics will also likely be negatively affected by these doubts. Let me copy what I wrote in another thread over here:
Madeleine Birchfield said:
More importantly, a significant part of science is the ability to replicate experiments to demonstrate the validity of scientific results for contemporary society. One reason we know that general relativity is currently scientifically true in various domains is because there are currently people replicating experiments and observations in their labs first done in the 1910s and 1920s and the results have all been in accordance with general relativity.
However, much of the evidence for the Higgs boson and other parts of the standard model has come from various particle colliders such as the LEP and its replacement the LHC.
Now, there is currently debate over whether there will be funding for a replacement particle collider when the LHC stops performing experiments and gets decommissioned in the future. If no replacement particle collider ever gets built due to lack of funding, eventually a new generation of physicists will grow up not being able to replicate the experiments at the LEP and the LHC, won't be able to detect the Higgs boson, and won't have anybody around who still remembers when scientists at CERN did perform experiments to detect the Higgs.
In this case, I highly doubt that "the Higgs boson exists today" remains a scientific truth to those future physicists, because all that the physicists will have is the fact that "the LHC detected the Higgs in 2012", and they will no longer have any way of verifying whether the results of the LHC experiments still hold in their time, or whether something has changed in the years between the LHC and their contemporary society, potentially yielding different results for the experiments and thus requiring physicists to go beyond the standard model. While "the Higgs boson existed in 2012" is a verifiable fact, "the Higgs boson exists today" becomes just another hypothesis in natural philosophy which is beyond scientific experiment.
Madeleine Birchfield said:
I think that as governments and universities and research institutes gradually cut funding to particle physics experiments and redirect them elsewhere, i.e. to astrophysics, condensed matter, etc, a lot of standard model physics will in the future become like what quantum gravity is today, still being studied as theory by physicists in physics departments but with no contemporary connections to scientific experiment.
Madeleine Birchfield said:
Trying to use the results of the LHC in the 2010s and 2020s to justify the truth of the standard model after the LHC has been shut down runs straight into the problem of induction. What the proponents of a new particle collider are trying to do is to set up a replacement for the LHC so that they can circumvent the problem of induction by directly testing for the Higgs and etc using the scientific method, so that the standard model can avoid becoming the next quantum gravity / string theory.
Madeleine Birchfield said:
Federica Pasqualone said:
But I should also mention, given the appropriate nature of this thread hahah, that there are logical arguments against it!
Assuming this is referring to the standard model, I have some doubts about the future of the standard model that I would like to share. [...]
I was referring to the "theory of everything".
Federica Pasqualone said:
I was referring to the "theory of everything".
Yeah, well my points hold even more strongly for any "theory of everything". For the standard model, the lack of access to scientific experiments confirming it is something that isn't the case now but could possibly happen in the future; for the various theories of everything in theoretical physics, that lack of access is a reality today.
Why should anybody believe in a "theory of everything" if there is no scientific evidence that gravity is united with the rest of the forces?
Well, but even before thinking of switching to the experimental side of the story, having a formal theory of everything - meaning an axiomatics for describing every fundamental force in the universe - is an issue.
The "theory of everything" isn't actually meant to be a theory of everything in the universe, it was merely a term used by particle physicists to describe the unification of the 3 quantum forces (electromagnetism, strong, and weak forces) with gravity into one unified field theory.
Even if the particle physicists had managed to succeed in their quest for the theory of everything, it wouldn't literally be a theory of everything, since there are things, like consciousness, which remain inexplicable using the unified field theory.
There is also the risk that dark matter exists and is mediated by some new fifth force. Dark matter is already outside the original definition of the "theory of everything" since the theorists were just focusing on unifying the three forces of the standard model with gravity.
Federica Pasqualone said:
But I should also mention, given the appropriate nature of this thread hahah, that there are logical arguments against it too! (besides others of various nature)
Yes, here we are again: LOGIC :heart_eyes:
Yes, there are indeed a bunch of arguments against it.
There are also other arguments against the standard model.
In the standard model, neutrinos do not oscillate. But we know from scientific experiment and observations that neutrinos do oscillate. So we have to go beyond the standard model and add a mechanism that would allow neutrinos to oscillate. Most proposals add neutrino masses, while some others break Lorentz symmetry.
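For readers who haven't seen it, the quantitative content of "neutrinos oscillate" in the usual mass-based explanation is, in the two-flavour vacuum approximation (a standard formula, added here for context):

$$P(\nu_\alpha \to \nu_\beta) \;=\; \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 L}{4E}\right) \;\approx\; \sin^2(2\theta)\,\sin^2\!\left(1.27\,\frac{\Delta m^2[\mathrm{eV}^2]\, L[\mathrm{km}]}{E[\mathrm{GeV}]}\right),$$

which vanishes identically if the mass splitting $$\Delta m^2$$ is zero, as in the unmodified standard model.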
So if I wager a guess as to the future of particle physics and the standard model - the large particle colliders like the LHC are likely going to be a thing of the past as funding for them dries up. Physicists will still believe in the Higgs boson even though they can no longer demonstrate its existence through repeated scientific experiment, and it will harden into dogma that gets passed down to generations of new physicists who cannot replicate the experiments. (This was one reason why I was saying in the other thread that physics is first and foremost a branch of "natural philosophy" and only secondarily a science.)
Meanwhile, the astrophysics community will likely turn decisively against the dark matter hypothesis as searches for dark matter turn up empty, and evidence accumulates that some relativistic MOND theory is actually favoured over dark matter to explain astrophysical observations in galaxies and galaxy clusters and in cosmology, meaning that funding for particle dark matter experiments will also end up being a thing of the past.
Supersymmetry is already virtually dead today.
In the end, neutrinos will be the only potential thing that could lead to beyond standard model physics, so most of the remaining energy and resources in particle physics will get poured into neutrino experiments and trying to sift through the different theories of neutrino oscillations that extend the standard model.
Neutrino physics and dark matter! These are the main areas in which they do experiments at the LNGS too in my hometown.
Yeah I know very well the story of the neutrinos :wink:
And, for the audience members whose education may be a bit far from physics (but who hopefully got interested in it), David Griffiths' book "Introduction to Elementary Particles" is the best book I know to start diving into the subject! :books:
Time to study guys! :nerd: :rolling_on_the_floor_laughing:
Since we're here at the frontiers of physics, I'm really tempted to go into another interpretation of "One universe as a foundation & friends" and talk about our current theories of physical cosmology.
This thread is better than HTT :upside_down: speaking of :infinity:
Would you mind waiting a little time? Maybe tomorrow, so that people are able to read first about the SM?
Sure.
Great then, as tradition wants: Any questions? :mic:
Madeleine Birchfield said:
Why should anybody believe in a "theory of everything" if there is no scientific evidence that gravity is united with the rest of the forces?
A theory of gravity + quantum (not necessarily quantum gravity, which deals specifically with introducing gravitons, but rather any theory that explains both gravity and quantum mechanics at once) is needed simply because there are domains and conditions in the universe that bring the two together. For instance, you would need a unified description to explain the gravitational force generated by an object in superposition. Let's say you had a ball capable of entering superposition, and you sent it through a beamsplitter-like apparatus that splits its matter wave. You now have the ball in a superposition of being in two very different, and very far apart, locations. Since all matter generates gravity, where is this gravity getting generated exactly? If the particle doesn't even have a well-defined position in space (in fact, any position in space at all), then how can it curve spacetime?
The more usual example is that general relativity breaks down in certain places like a black hole singularity, which necessitates a theory that has both quantum and gravity in it. There's also the black hole information paradox which deals with the intersection between these two theories. Because these two theories intersect in nature, there must be some law governing how they intersect, which will be this theory of gravity+quantum you call a "theory of everything".
John Onstead said:
A theory of gravity + quantum (not necessarily quantum gravity, which deals specifically with introducing gravitons, but rather any theory that explains both gravity and quantum mechanics at once) is needed simply because there are domains and conditions in the universe that bring the two together.
A theory that explains both gravity and quantum mechanics at once is not the same as a theory which says that there is one single unified force which at low energies separates into electromagnetism, the strong force, the weak force, and gravity. The latter is what particle physicists mean by uniting gravity with the other 3 forces, by the "theory of everything".
Madeleine Birchfield said:
Physicists will still believe in the Higgs boson even though they can no longer demonstrate its existence through repeated scientific experiment, and it will harden into dogma that gets passed down to generations of new physicists who cannot replicate the experiments.
I don't understand this position. Just because experiments stop confirming the Higgs boson doesn't suddenly mean the Higgs boson would cease to exist? That is, once we demonstrate its existence to a high degree of significance, why would we need to keep demonstrating its existence? The LHC is no longer running to establish the existence of the Higgs, that's already done. It's now focused on either discovering new particles, or discovering more about the particles we already know exist (like the Higgs). Though I would agree with your skepticism that larger colliders will be worth it, since unlike with the Higgs and the LHC we have no solid evidence that there even might be more massive particles to make such a thing worthwhile.
Madeleine Birchfield said:
Meanwhile, the astrophysics community will likely turn decisively against the dark matter hypothesis as searches for dark matter turn up empty, and evidence accumulates that some relativistic MOND theory is actually favoured over dark matter to explain astrophysical observations in galaxies and galaxy clusters and in cosmology
I disagree. No theory of MOND is capable of explaining the CMB or why some pairs of galaxies with the same mass and mass distribution have wildly varying stellar rotation curves. To explain the latter any such theory would have to either be internally inconsistent or vary across the universe like a scalar field, which would then just be dark matter again.
John Onstead said:
I don't understand this position. Just because experiments stop confirming the Higgs boson doesn't suddenly mean the Higgs boson would cease to exist? That is, once we demonstrate its existence to a high degree of significance, why would we need to keep demonstrating its existence? The LHC is no longer running to establish the existence of the Higgs, that's already done.
This is the problem of induction that I was referring to earlier in the discussion. Just because we have scientifically tested that the Higgs boson exists between 2012 and today doesn't mean that we can extrapolate from 12 years to infinitely far in the future.
An empiricist far in the future would still want to scientifically test that what the LHC has proven in the 2010s still holds in their time, and that the laws of physics haven't somehow changed in the meantime, replacing the Higgs boson with some other mechanism for breaking electroweak symmetry in the standard model.
John Onstead said:
I disagree. No theory of MOND is capable of explaining the CMB or why some pairs of galaxies with the same mass and mass distribution have wildly varying stellar rotation curves. To explain the latter any such theory would have to either be internally inconsistent or vary across the universe like a scalar field, which would then just be dark matter again.
The radial acceleration relation predicted by MOND holds in galaxies at distances far beyond where dark matter ought to exist in galaxies and where the gravity ought to be Newtonian if the dark matter hypothesis held:
https://iopscience.iop.org/article/10.1088/1475-7516/2024/04/020
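The radial acceleration relation referred to here is, in the fitting form commonly used (added for context; $$g_\dagger \approx 1.2 \times 10^{-10}\,\mathrm{m\,s^{-2}}$$ is the fitted acceleration scale):

$$g_{\mathrm{obs}} \;=\; \frac{g_{\mathrm{bar}}}{1 - e^{-\sqrt{g_{\mathrm{bar}}/g_\dagger}}},$$

relating the observed centripetal acceleration $$g_{\mathrm{obs}}$$ to the acceleration $$g_{\mathrm{bar}}$$ predicted from the visible (baryonic) matter alone.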
Wide binaries seem to follow MOND rather than Newtonian gravity - wide binaries are small enough that they shouldn't have any dark matter in them according to the dark matter hypothesis:
https://academic.oup.com/mnras/article/525/1/1401/7236869
Here's a paper from 2022 which summarises the status of dark matter and MOND using all the experimental evidence available for both, and which eventually concludes that MOND is favoured over dark matter:
https://www.mdpi.com/2073-8994/14/7/1331
Dark matter as an explanation for galaxy dynamics was a plausible hypothesis around 2018, but many of the astrophysical discoveries since 2018 have shifted the balance towards MOND.
Now, dark matter might still be needed to explain some cosmological effects such as early universe structure formation, but dark matter is having some issues there as well. Recent discoveries of very distant and very bright galaxies by the JWST indicate that structure formation in our universe is faster by three orders of magnitude than that predicted by dark matter + the rest of the concordance model of cosmology, and it is primarily the dark matter part of the model which influences early universe structure formation. So I think it likely that dark matter here will end up getting swapped out with something that can make structure form faster in the universe, even if it might not be MOND.
On the other hand, with the Hubble tension and the S8 tension and the cosmic dipole anisotropies and the CMB anisotropies and the KBC void, I think it likely that the cosmological principle itself and the FLRW metric will get thrown out the window in cosmology, and then who knows what will cosmology look like a decade from now, if cosmological dark matter is even going to be relevant then.
Interesting points! I can see the logic in how physicists would want to test to make sure that the laws of physics don't change with time. I'm not too worried about the Higgs boson in particular, though- if that were to change, we'd know. Or, well, we wouldn't, since at that very moment we'd all be wiped out by the vacuum decay bubble expanding at the speed of light!
As for dark matter vs MOND, the articles presented are very interesting and do point to anomalies with the so-called standard model of cosmology (lambda-CDM). I already think lambda-CDM needs an update, because CDM might not be the best fitting model of dark matter (others exist, such as fuzzy dark matter, the dark sector with multiple dark matter particles, self-interacting dark matter, warm dark matter such as due to right handed neutrinos, etc.) and dark energy might not be as constant as we'd thought thanks to DESI results.
At the same time each article presented has also had a counter-article published that refutes the claims. For instance, the wide binary data from GAIA was a major point of controversy last year, with articles firing back and forth from both sides. From my perspective at least, the matter seemed settled by the paper which not only indicated a very low probability for MOND, but also explained why the authors of the paper you presented got the results they did: https://academic.oup.com/mnras/article/527/3/4573/7342478. Fun fact, the lead author Banik of this paper is the very same author as the paper from 2022 you cited! So this paper isn't biased against MOND since the lead author was formerly one of its supporters.
The JWST results of early formation were a significant anomaly when first reported. Recent results indicate, however, that the initial results overestimated the masses of the galaxies: https://academic.oup.com/mnras/advance-article/doi/10.1093/mnras/stae1547/7697173. While anomalies remained with regard to supermassive black holes, even more recent results have indicated the problem was less to do with the standard model of cosmology and more to do with our model of how black holes formed in the early universe. Traditionally we considered SMBHs to form in the following order: stars form -> collapse -> stellar mass black hole -> black hole mergers -> SMBH. Indeed, there's not enough time for this to have been the method to form SMBHs and the galaxies around them and still match the JWST observations. But when astrophysicists considered another mechanism of SMBH formation, direct collapse out of dense gas clouds in the early universe, the model matched the observations perfectly!
Interestingly, about the Hubble tension, an article very recently came out by Wendy Freedman: https://arxiv.org/abs/2408.06153. It uses different methods to test the Hubble constant and finds some to be in agreement with the CMB data, in opposition to the tension, while also confirming that there is still very much a tension when using Cepheid variable stars. While conclusions may vary, I think it's likely this points to the Hubble tension not being an actual problem with cosmology, and more to do with a systematic error in Cepheid analysis.
The most interesting point/article to me is the first one, but only because it has come out recently and so not enough time has gone by for a response. But I imagine this only rules out dark matter models where dark matter solely forms very dense clumps. But astrophysicists have already been moving away from those given they cause problems, such as the "core-cusp problem" and "missing satellites problem". Other forms of dark matter that don't clump as much are still consistent with the results since the galaxy halos they form will inevitably be much larger and dissipate less suddenly than those predicted by standard CDM.
This discussion has been very illuminating! I should stop here before I derail it too much :)
John Onstead said:
Interestingly, about the Hubble tension, an article very recently came out by Wendy Freedman: https://arxiv.org/abs/2408.06153. It uses different methods to test the Hubble constant and finds some to be in agreement with the CMB data, in opposition to the tension, while also confirming that there is still very much a tension when using Cepheid variable stars. While conclusions may vary, I think it's likely this points to the Hubble tension not being an actual problem with cosmology, and more to do with a systematic error in Cepheid analysis.
Meanwhile, the SH0ES group has published a paper shortly after on the exact same data set that the Freedman group used: https://arxiv.org/abs/2408.11770. They came to the exact opposite conclusion from the Freedman group: that the H0 values for TRGB and JAGB were around the Cepheid H0 value, implying that the Hubble tension is still there and not likely due to Cepheid systematics.
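For context, the numbers behind the "tension" (approximate values, added by me): the early-universe fit gives $$H_0 \approx 67.4 \pm 0.5\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$$ (Planck CMB), while the Cepheid-calibrated supernova ladder gives $$H_0 \approx 73.0 \pm 1.0\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$$ (SH0ES), a roughly $$5\sigma$$ discrepancy if both error bars are taken at face value.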
John Onstead said:
I already think lambda-CDM needs an update, because CDM might not be the best fitting model of dark matter (others exist, such as fuzzy dark matter, the dark sector with multiple dark matter particles, self-interacting dark matter, warm dark matter such as due to right handed neutrinos, etc.) and dark energy might not be as constant as we'd thought thanks to DESI results.
Fundamentally, there are two different kinds of dark matter that would be needed in the current dominant paradigm, and this distinction mostly goes unmentioned in these dark matter discussions.
The dark matter required in galaxies and galaxy clusters needs to be some kind of baryonic matter whose behaviour is able to mimic the MOND gravity force law, and so would have to be some kind of non-cold dark matter like self-interacting dark matter or superfluid dark matter, etc.
However, the dark matter in cosmology requires that the dark matter be cold and non-baryonic, so that they can get the Lambda CDM model to fit CMB observations and early universe structure formation (prior to JWST), something that wouldn't work with the non-cold baryonic matter needed in galaxies and galaxy clusters.
So it's not that Lambda CDM needs an update to dark matter, but that it's two separate debates entirely.
The problem for ongoing particle dark matter experiments is that so far they only have been trying to discover the cold dark matter in cosmology - the WIMPs, axions, etc...
If they do end up discovering something like WIMPs or axions and confirm the dark matter part of the Lambda CDM model, it will not resolve the dark matter in galaxies question, since WIMPs and axions do not behave in a manner which can mimic MOND in galaxies.
John Onstead said:
At the same time each article presented has also had a counter-article published that refutes the claims. For instance, the wide binary data from GAIA was a major point of controversy last year, with articles firing back and forth from both sides. From my perspective at least, the matter seemed settled by the paper which not only indicated a very low probability for MOND, but also explained why the authors of the paper you presented got the results they did: https://academic.oup.com/mnras/article/527/3/4573/7342478. Fun fact, the lead author Banik of this paper is the very same author as the paper from 2022 you cited! So this paper isn't biased against MOND since the lead author was formerly one of its supporters.
The Banik paper wasn't so conclusive in my eyes because there was a lot of controversy over how he chose his sample in a manner that may have caused him to misidentify a particular MOND behaviour called the "external field effect" as Newtonian behaviour, since Banik's data never reaches into the Newtonian regime to detect any difference between the Newtonian regime and the MOND external field effect regime.
The external field effect occurs when the wide binaries are in some external gravitational field, here the Milky Way. The orbits act like they do in Newtonian gravity but where Newton's constant is multiplied by some scalar factor greater than 1 that is dependent on the MOND acceleration limit parameter and the strength of the external gravitational field and a few other things. This is discussed here:
So the best thing to do right now is to wait for more data to come in on the wide binary debate and the Hubble tension.
But I believe it likely that the data that comes in the future will simply strengthen the trends of the Hubble tension being a real issue and not just systematics, and the trends of anomalies popping up everywhere that look like MOND behaviour.
Federica Pasqualone said:
Great then, as tradition wants: Any questions? :mic:
I'm really enjoying having a discussion being actively led by someone, this is fun. Thanks @Federica Pasqualone
Morgan Rogers (he/him) said:
Federica Pasqualone said:
Great then, as tradition wants: Any questions? :mic:
I'm really enjoying having a discussion being actively led by someone, this is fun. Thanks Federica Pasqualone
I'm glad to hear that! @Morgan Rogers (he/him) Anytime!
Madeleine Birchfield said:
Since we're here at the frontiers of physics, I'm really tempted to go into another interpretation of "One universe as a foundation & friends" and talk about our current theories of physical cosmology.
Federica Pasqualone said:
Anyone on multiverse theory? :rolling_on_the_floor_laughing:
All right, let's talk about the multiverse.
The multiverse is a philosophical hypothesis which says that there is more than one universe. This hypothesis is untestable for the foreseeable future, and so it isn't scientific.
In physics, the multiverse comes into multiple flavours.
There is the string theory multiverse, where the multiverse is just a philosophical argument to justify why there are at least $$10^{500}$$ possible string theories. I don't really take these multiverse arguments seriously because ultimately most string theorists still hope for and believe in the one true unified field theory of everything, which ultimately means selecting one of the $$10^{500}$$ string theories and arguing why this one is favoured over the other string theories. The multiverse argument is mostly used to shut up critics of string theory who say that $$10^{500}$$ string theories mean that string theory is a worse framework than sticking with the existing framework of quantum field theory or moving to some other alternative framework for a theory of everything - because it's nearly impossible to find what they are looking for among $$10^{500}$$ theories, or because $$10^{500}$$ theories mean that string theory becomes vacuous, since it can basically explain everything and anything imaginably possible.
Then there is the multiverse from inflation theory.
Inflation theory is a philosophical theory from physical cosmology which says that, assuming that the universe is expanding, in the very early universe there was a period of exponential expansion of space. This was originally motivated as a way to make space flat, but then it was later discovered that in many models of inflation, the exponential expansion of space tends to expand forever, with only pockets of space with slower expansion.
If we interpret pockets of space with slower expansions as universes, the fact that there are many of these pockets of space separated by a vast gulf of exponentially expanding space means that there are multiple universes, hence the multiverse.
However, this is just one interpretation of the models of inflation. Another interpretation is that there is only one universe in which space just happens to expand faster in some places and slower in others; as a result it's impossible to get from one patch of slowly expanding space to another patch of slowly expanding space because of the exponentially expanding space in between, giving the illusion that there are multiple universes.
The reason why inflation theory, while part of physics via physical cosmology, isn't a scientific theory is that there is no observational evidence for anything before the cosmic microwave background. We don't even know if there is a cosmic neutrino background, and nearly all inflation theories would have inflation occurring and ending orders of magnitude earlier than the cosmic neutrino background.
So all that is occurring is philosophical theorising from mathematical models and numerical simulations using the big bang theory as an underlying framework.
Welcome back guys! :smirk:
First off, lemme thank you once more @Madeleine Birchfield for the wonderful explanations and all the readers (silent and not) for making this discussion so successful! Thank you all!
Federica Pasqualone said:
Anyone on multiverse theory? :rolling_on_the_floor_laughing:
It seems here I cannot make jokes anymore ... :joy:
I am wondering how many of you are familiar with the basics of cosmology theory ... the physical one :)
Lemme pull out a good book ... :heart_eyes:
Federica Pasqualone said:
Federica Pasqualone said:
Anyone on multiverse theory? :rolling_on_the_floor_laughing:
It seems here I cannot make jokes anymore ... :joy:
I just wanted to talk about a topic that was related to "One universe as a foundations & friends", and I felt the multiverse was suitably close to that.
I’m surprised you described two multiverse theories, neither of which is the many worlds interpretation of quantum mechanics! That’s the only one I knew about: as I understand it, it basically says that when a wave function collapses, the universe actually branches out into versions where each possibility happened. This makes for quite a lot of near-parallel as well as very different universes; it’s quite mind-boggling to try to imagine any space that might parametrize them all.
But i actually forgot another version of the multiverse.
There is a multiverse in mathematics. It's called the set-theoretic multiverse, and it's basically the philosophical position of pluralism applied to set-theoretic foundations.
Madeleine Birchfield said:
Federica Pasqualone said:
Federica Pasqualone said:
Anyone on multiverse theory? :rolling_on_the_floor_laughing:
It seems here I cannot make jokes anymore ... :joy:
I just wanted to talk about a topic that was related to "One universe as a foundations & friends", and I felt the multiverse was suitably close to that.
That's an amazing topic! and you're right: It fits well with the title.
And I wanted also to suggest, as I usually do here, a book: "Introduction to Cosmology" by Barbara Ryden :nerd: :books:
Madeleine Birchfield said:
But i actually forgot another version of the multiverse.
There is a multiverse in mathematics. It's called the set-theoretic multiverse, and it's basically the philosophical position of pluralism applied to set-theoretic foundations.
:rolling_on_the_floor_laughing: please now let's go back to physics hahhahahah
Kevin Carlson said:
I’m surprised you described two multiverse theories, neither of which is the many worlds interpretation of quantum mechanics! That’s the only one I knew about: as I understand it, it basically says that when a wave function collapses, the universe actually branches out into versions where each possibility happened. This makes for quite a lot of near-parallel as well as very different universes; it’s quite mind-boggling to try to imagine any space that might parametrize them all.
The philosophy of quantum mechanics is a subject that I haven't gone too deeply into.
The thing is that it's primarily talking about particle quantum mechanics, while we know from quantum field theory and the standard model of particle physics that particles are just excitations of fermion fields and boson fields which span the entire universe, including any of the detectors themselves which are measuring the particles.
What gets interpreted as wave function collapse is actually in my opinion the collapse of the particle approximation - the particle has interacted with a detector, which is a very complex quantum system made of interacting fermion and boson fields - and the particle is no longer a free particle, it has become part of the complex quantum system that is the detector. One has to take into account the state of the fermion and boson fields which comprise the detector, which nobody really does when talking about wave function collapse.
This reminds me of how I started a couple of talks last year speaking of observables in the quantum world.
Here is the video: https://www.youtube.com/watch?v=NvzSLByrw4Q
Popcorn? :popcorn:
Thanks @Kevin Carlson for bringing it up!
There are a lot of issues in particle quantum mechanics which simply disappear when you move to relativistic quantum field theories like the standard model.
For example, in quantum mechanics, there is the delayed-choice quantum eraser experiment, which seems to indicate that retrocausality is possible. However, Philippe Eberhard proved here that it is actually never possible to detect retrocausality in a relativistic quantum field theory, and our universe is currently described by one - the standard model of particle physics.
Madeleine Birchfield said:
What gets interpreted as wave function collapse is actually in my opinion the collapse of the particle approximation - the particle has interacted with a detector, which is a very complex quantum system made of interacting fermion and boson fields - and the particle is no longer a free particle, it has become part of the complex quantum system that is the detector.
I'm a bit confused by this response. Are you saying that the state of the detector should determine whether I measure spin up or spin down?
The basic point seems to be that there were two possible outcomes before the measurement, and either both happened, but can't interact, or only one does. How do fields address this dilemma?
Spencer Breiner said:
Madeleine Birchfield said:
What gets interpreted as wave function collapse is actually in my opinion the collapse of the particle approximation - the particle has interacted with a detector, which is a very complex quantum system made of interacting fermion and boson fields - and the particle is no longer a free particle, it has become part of the complex quantum system that is the detector.
I'm a bit confused by this response. Are you saying that the state of the detector should determine whether I measure spin up or spin down?
The basic point seems to be that there were two possible outcomes before the measurement, and either both happened, but can't interact, or only one does. How do fields address this dilemma?
Lemme explain this business a little more. If you want to measure the length of your desk, for instance, what do you do? You grab a metric tape and read the measurement. Nothing happens either to the desk or to the tape, right?
Madeleine Birchfield said:
More importantly i guess, do you use excluded middle in your logic or not?
Only true logicians will answer "yes" here
sorry to hijack back the discussion to logic
And if you repeat the measurement the day after, most probably (up to a small error) you will read the same result. Here that is not the case, @Spencer Breiner. The fact that in order to make a measurement an instrument has to be introduced changes the whole thing!
fosco said:
sorry to hijack back the discussion to logic
haha no worries! Are you suggesting we should talk about the set-theoretic multiverse theory? @fosco :rolling_on_the_floor_laughing:
Can you summarize what the discussion has been about so far? It feels.. scattered.
Sure! After pointless topology, we moved to open issues in theoretical physics and then we started giving some arguments and references on the Standard Model, Cosmology and Multiverse theory. There were comments about both the experimental physics side and the philosophical one.
Am I forgetting anything?
fosco said:
Can you summarize what the discussion has been about so far? It feels.. scattered.
So we began with discussing the different kinds of predicativity in mathematics, then made a huge detour into discussing intuitionistic Zermelo-Fraenkel set theory. Then I asked Federica if she used excluded middle and then we got into a discussion on how it would be a real challenge for constructive mathematicians to do mathematical physics. Then it segued to a discussion about the path integral in quantum field theory and about the standard model of particle physics, and then we started talking about the theories of everything.
Then I had a conversation about my views on dark matter vs MOND and the Hubble tension, before finally ending up at the multiverse.
I was surprised when you said that no one seems to be doing / have done measure theory without LEM (that's the bunch of messages where I stayed for a while)
I think the challenge then, besides particle physics, might be making ergodic theory constructive (and that would be a fun thing to read)
also, I'd like to hear Urs' opinion on the matter...
fosco said:
I was surprised when you said that no one seems to be doing / have done measure theory without LEM (that's the bunch of messages where I stayed for a while)
Actually, my statement is incorrect. Steve Vickers did do some measure theory in geometric logic, and thus did not use LEM, such as in this paper:
https://arxiv.org/abs/2312.05228
However, it was mostly in the context of real analysis. I am not aware of anybody who has pushed it towards probability theory yet.
I'll have to go through Steve Vickers's sources and other papers to see if this was entirely developed on his own or if he took some developments from other authors as well.
fosco said:
also, I'd like to hear Urs opinion on the matter...
From what Urs Schreiber wrote in [[Science of Logic]], it seems that he believes that excluded middle should hold for at least the discrete types / infinity-groupoids since he identifies the sharp modality with , which wouldn't be the case in constructive mathematics.
But it would be better to ask him directly.
fosco said:
I was surprised when you said that no one seems to be doing / have done measure theory without LEM (that's the bunch of messages where I stayed for a while)
From what I can also tell, @Toby Bartels has also done some constructive measure theory as well; see for example [[Cheng space]] (which he attributed to a Henry Cheng but provided no references for) and [[measurable locale]] and [[locale of real numbers]]. But the things that Bartels put on the nLab seem to be either from some unknown source or original research.
But I don't really know of anyone else doing constructive measure theory. Measurable locales were developed by Dmitri Pavlov and he uses classical logic; there is a related approach using $$\sigma$$-locales by Alex Simpson but he also uses classical logic.
A ref that comes to mind is http://www.contrib.andrew.cmu.edu/~awodey/students/jackson.pdf which, although classical (is it? I just skimmed the intro and read a few paragraphs years ago...),
seems to connect well with whatever theory of measurable locales is out there.
Now that I'm digging into the references in the Steve Vickers paper, there is actually a lot of constructive measure theory, but none of it is mentioned on the nLab.
Example, this paper by Thierry Coquand and Bas Spitters: https://arxiv.org/pdf/0808.1522
Then in the references of the Coquand-Spitters paper there is this paper by the first author:
https://link.springer.com/article/10.1007/s11117-004-7399-0
and this paper by Coquand and Erik Palmgren:
https://link.springer.com/article/10.1007/s001530100123
and this paper by Spitters:
https://www.cs.au.dk/~spitters/obs.pdf
nice
Before he moved on to study homotopy type theory, Bas Spitters also authored a few papers with Chris Heunen (mentioned above) which interpret some quantum physics in constructive mathematics:
Chris Heunen said:
But for that theorem you in fact don't need Soler! Matthew Di Meglio and I figured out how to relate categorical colimits to analytic limits of real numbers directly, which is much more satisfying. See also Matt's nice blog post on the process of how that came to be.
Well here the issue is that I'm not aware of anybody who has proven that DeMarr’s theorem, which replaced Soler's theorem, holds in constructive mathematics.
I suspect that DeMarr's theorem won't hold in constructive mathematics because there are multiple Archimedean ordered Heyting fields (which are partially ordered semifields) which are sequentially complete and thus are monotone sequentially complete, but are not provably equivalent to each other.
And there are axiom sets that one can add to constructive mathematics to show that there are two provably inequivalent sequentially complete Archimedean ordered fields: the LPO and Brouwer's continuity principle for Dedekind reals show that the Escardo-Simpson reals (the initial sequentially complete Archimedean ordered field) are inequivalent to the Dedekind reals (the terminal sequentially complete Archimedean ordered field) - the former is a discrete field with LPO while the latter is provably not a discrete field with Brouwer's continuity principle.
I also suspect the partial order won't be enough and one actually needs the constructive strict order structure on semifields.
Madeleine Birchfield said in part:
From what I can also tell, Toby Bartels has also done some constructive measure theory as well; see for example [[Cheng space]] (which he attributed to a Henry Cheng but provided no references for) and [[measurable locale]] and [[locale of real numbers]]. But the things that Bartels put on the nLab seem to be either from some unknown source or original research.
Henry Cheng's work is described in the updated version (by Douglas Bridges) of Errett Bishop's book on constructive analysis; that's the source for the nLab page. There are references to Cheng's papers in that book, but I haven't ever read those. The contribution by Todd and me described there as abstract nonsense is really just the observation that Cheng was working with a Chu space; this also connects to Mike Shulman's work on the antithesis interpretation of linear logic in constructive mathematics (which had not been done yet).
The only thing constructive about measurable locales is that since they have so many equivalent formulations, I have an intuition that one of them is more likely to work out constructively, so I said so. It is important to realize, since locales provide a constructive approach to topology, that they do not as yet provide a constructive approach to measure theory, but I don't know how to fix that.
As for the locale of real numbers, the constructive formulation and proof of Cousin's Theorem, and the resolution that follows of the paradox that the set of real numbers has measure zero in Russian constructivism, are my original work as far as I know. But I never worked out anything beyond what I wrote there on the nLab. It makes me feel better to think that a constructive treatment of the Henstock–Kurzweil integral is not out of the question, but I don't have such a treatment myself.
Federica Pasqualone said:
And if you repeat the measurement the day after, most probably (up to a small error) you will read the same result. Here that is not the case, Spencer Breiner. The fact that in order to make a measurement an instrument has to be introduced changes the whole thing!
I still don't see the relevance. Can you be more specific?
Are you suggesting that the measurement is deterministic given the joint state of particle and measuring device? If not, how does the probability of the unrecorded result migrate (collapse?) to the observed branch? In particular, what do fields have to do with this difference?
Also, I'm under the impression that repeated measurements after the first will continue to see the same result, modulo other interactions with the particle.
Toby Bartels said:
Henry Cheng's work is described in the updated version (by Douglas Bridges) of Errett Bishop's book on constructive analysis; that's the source for the nLab page. There are references to Cheng's papers in that book, but I haven't ever read those.
Is this the textbook you are referring to?
https://link.springer.com/book/10.1007/978-3-642-61667-9
Madeleine Birchfield said:
Is this the textbook you are referring to?
Yes! And if you scroll down, Springer offers a PDF file of the end matter, including the bibliography, where you can find Cheng's name three times. So presumably those are the papers where this is developed. (It's also summarized in Chapter 6 of the book, Integration, which I read about 20 years ago.)
Toby Bartels said:
Madeleine Birchfield said:
Is this the textbook you are referring to?
Yes! And if you scroll down, Springer offers a PDF file of the end matter, including the bibliography, where you can find Cheng's name three times. So presumably those are the papers where this is developed. (It's also summarized in Chapter 6 of the book, Integration, which I read about 20 years ago.)
What seems to have tripped people up in the past is that on the nLab you had linked to "[[Handbook of Constructive Analysis]]", which was not the name of the above book and led to a non-existent page on the nLab. The actual title of the book is "[[Constructive Analysis]]", which does lead to the nLab's article on the Bishop and Bridges textbook.
Mike Shulman asked about this a few years ago on the nForum but didn't get a definitive yes-or-no answer to his question.
Madeleine Birchfield said:
What seems to have tripped people up in the past is that on the nLab you had linked to "[[Handbook of Constructive Analysis]]", which was not the name of the above book and led to a non-existent page on the nLab. The actual title of the book is "[[Constructive Analysis]]", which does lead to the nLab's article on the Bishop and Bridges textbook.
Mike Shulman asked about this a few years ago on the nForum but didn't get a definitive yes-or-no answer to his question.
Sorry, I didn't notice that mistake! Yes, HCA should be FCA (which strictly speaking is only the title of the first edition, by Bishop alone; Bridges shortened the name for the revised edition, which is the one that covers the work by Cheng).
What arguments are there that basing constructive measure theory on locales in particular is a good idea (over and above something like the metric Boolean algebras of Coquand and Palmgren or even a point-set approach as in Bishop and Cheng)?
Locales seem very poorly suited for recovering the already known computational content of measure theory. They're complete lattices and don't have well-behaved complements. You cannot make a non-trivial constructive measure theory in which measures are Dedekind or Cauchy reals and the collection of measurable 'things' is a complete lattice or even a countably complete lattice: There is a computably presentable open subset of whose measure is not computable.
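To make that last claim concrete, here is one standard construction (the notation A, Iₙ, U is chosen here for illustration): fix a computably enumerable but non-computable set A ⊆ ℕ and pairwise disjoint rational intervals Iₙ inside the unit interval with |Iₙ| = 2^{-(n+2)}. Then

$$U \;=\; \bigcup_{n \in A} I_n, \qquad \mu(U) \;=\; \sum_{n \in A} 2^{-(n+2)},$$

so U is a computably presentable open set, but the binary expansion of μ(U) reads off membership in A, and hence μ(U) is a left-c.e. real that is not a computable (Cauchy or Dedekind) real.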
James E Hanson said:
What arguments are there that basing constructive measure theory on locales in particular is a good idea (over and above something like the metric Boolean algebras of Coquand and Palmgren or even a point-set approach as in Bishop and Cheng)?
In constructive mathematics, [[measurable locales]] are a misnomer because they cannot be proven to be locales constructively. So in the conversation above we aren't talking about using locale theory for measure theory.
I guess then my question is why is there a mystery about the 'right' way to do measure theory constructively? We already have a constructive point-free approach to measure theory (i.e., metric Boolean algebras) that is about as good as one could plausibly hope for and agrees with real analysts' existing intuition about what the 'actual content' of measure theory is. In order to make it entirely smooth, one either needs to work with the Cauchy reals and track codes of objects explicitly (or implicitly, as ends up happening with setoid presentations such as the one in Coquand and Palmgren) or one needs countable choice, but this is already true classically and countable choice has a computational interpretation.
For example, if we want to think about the typical measure theory on , we can start by looking at the formal Boolean algebra of finite unions of intervals of the form . It is easy to then define the typical notion of measure on and show that is a metric. We can then take the completion of with regard to this metric, which gives the formal probability algebra of 'measurable subsets of '. Call this . Then you show that the measure and the meet, join, and complement operations are all Lipschitz on and so extend canonically to all of , yielding a metric Boolean algebra. Then you can start defining things like spaces using formal simple functions (or you could even start with formal simple functions based on ).
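In symbols (with notation chosen here for illustration, writing B₀ for the interval algebra and μ for its finitely additive measure): the metric and the Lipschitz bounds in question are

$$d(a,b) = \mu(a \mathbin{\triangle} b), \qquad |\mu(a) - \mu(b)| \le d(a,b), \qquad d(a \vee c,\, b \vee c) \le d(a,b), \qquad d(\neg a, \neg b) = d(a,b),$$

(and similarly for ∧), so μ and the Boolean operations extend uniquely to the metric completion B of B₀.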
You can even connect it back to a more traditional point-set approach. In particular, I believe you can define a constructive point-set notion of Lebesgue measurable subset of that also recovers and has other nice properties such as positive measure sets being inhabited and knowing that for a sequence of measurable sets , the union is measurable if exists in the Cauchy reals. Bishop and Cheng already identified that this extra condition is really the only thing you need to make point-set measure theory work constructively. The possibility of such an approach to measure theory follows from their work but is also implicit in some work on computable measure theory (such as that of Schnorr). Again this needs countable choice, but without it one should be able to more or less recover an analogous theory using coded sets (such as in the chapter on choiceless measure theory in Fremlin). Measure theory on non-finite measure spaces (such as ) shouldn't be that much harder, if a little annoying.
I guess then my question is why is there a mystery about the 'right' way to do measure theory constructively?
Because most likely when @Toby Bartels was writing the nLab's article on measurable locales in 2011 he wasn't aware of Coquand and Palmgren's point-free approach to measure theory.
As I mentioned above, there's a lot of constructive measure theory which isn't on the nLab.
Going back to a comment posted by John Baez,
John Baez said:
As for mathematical physics using constructive reasoning, it will definitely be fun for constructivists to build up the framework of analysis required to do more and more mathematical physics. But to get more mathematical physicists interested, it would be helpful to come up with some examples of where constructivism might help us understand physics. E.g. some ways in which a locale of real numbers is "better" - for the purposes of physics, not logic! - than the usual set of real numbers.
The issue here is that, already in constructive real analysis, there are multiple notions of real numbers which are not provably the same but which are each important in the subject.
In some constructive approaches to integration, lower integrals are defined using the lower Dedekind reals, while upper integrals are defined using the upper Dedekind reals, and neither the lower reals nor the upper reals are provably equivalent to the Dedekinds themselves without excluded middle. The lower reals are also used to define metric spaces in constructive mathematics.
For those interested in calculating digit representations of real numbers, the relevant notion of reals here is the Cauchy real numbers, which are not provably equivalent to the Dedekind reals without either excluded middle or countable choice.
There is also, as you mentioned, the locale of real numbers which is useful for proving some topological and analytic properties of the real numbers which cannot be proven in the point-set manner for the Dedekind reals.
I have a feeling that in constructive mathematical physics, many of these sets of real numbers are going to be used just to get some of the proofs and concepts working, just like in constructive real analysis.
Morning everyone! We are not taking any day off here! Very good :smirk:
Have a wonderful start of Fall semester 2024!
Anyways, Chu spaces, bornologies .... all great stuff! Thanks a lot for the references and the comments on the difficulties of measure theory ... and intuitionistic logic :heart_eyes: !
We are learning so much these days!
Before coming back to physics, let me add a little caveat to the whole enterprise - a caveat that sort of was also in the remark of John Baez: If you want to help the physics department, do not strive to write down 1.5K pages of abstract and rigorous math without any pragmatism, but rather let the spirit of physics guide your academic writing! :diya_lamp:
Federica Pasqualone said:
Before coming back to physics, let me add a little caveat to the whole enterprise - a caveat that sort of was also in the remark of John Baez: If you want to help the physics department, do not strive to write down 1.5K pages of abstract and rigorous math without any pragmatism, but rather let the spirit of physics guide your academic writing! :diya_lamp:
Yeah, well the same could be said about many other things being done in mathematical physics. I don't think a lot of physicists actually care about stuff like Grady's and Pavlov's proof of the [[geometric cobordism hypothesis]], because it is a whole bunch of rigorous math which doesn't have much pragmatic relation to the stuff that physicists actually work with.
Similarly, physicists don't actually care whether the path integral can be formally defined in mathematics, while mathematicians working in mathematical physics do.
and also, since we are gossiping about him so much :rolling_on_the_floor_laughing: ...
You can recognize truth by its beauty and simplicity. When you get it right, it is obvious that it is right—at least if you have any experience—because usually what happens is that more comes out than goes in.... The inexperienced, the crackpots, and people like that, make guesses that are simple, but you can immediately see that they are wrong, so that does not count. Others, the inexperienced students, make guesses that are very complicated, and it sort of looks as if it is all right, but I know it is not true because the truth always turns out to be simpler than you thought.
~ Richard Feynman :heart:
James E Hanson said:
What arguments are there that basing constructive measure theory on locales in particular is a good idea (over and above something like the metric Boolean algebras of Coquand and Palmgren or even a point-set approach as in Bishop and Cheng)?
Locales seem very poorly suited for recovering the already known computational content of measure theory. They're complete lattices and don't have well-behaved complements. You cannot make a non-trivial constructive measure theory in which measures are Dedekind or Cauchy reals and the collection of measurable 'things' is a complete lattice or even a countably complete lattice: There is a computably presentable open subset of whose measure is not computable.
I don't know enough about measure theory per se to speak on it definitively, but I can say something about probability theory. I think there are strong arguments for including the topological structure of objects more or less in all circumstances, but especially in constructive mathematics. So in particular, I think it is helpful to do so when dealing with measures. Classically, the theory of valuations agrees with that of measures in 'good' situations (Radon measures on compact Hausdorff spaces). It also works well constructively. Possibly using only opens leads to some issues with certain more advanced topics in measure theory, but analysis works pretty smoothly. I guess time will tell whether valuations are sufficient for practical considerations, but I tend to think they will be.
You are right that you don't use Dedekind reals for valuations (and certainly not Cauchy reals). Instead you use lower reals. I think this is intuitively the correct thing to do and I can say more on this if you like. I have some thoughts about how to recover Dedekind reals in certain cases, but it is work in progress, so I would rather not say too much about it yet.
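One standard way to see why the lower reals are the natural home here (a gloss of my own, not necessarily the explanation being alluded to): the valuation of an open set is a supremum of finitely certified lower bounds. For Lebesgue measure on the line, for instance,

$$\mu(U) \;=\; \sup\Big\{ \sum_{k=1}^{n} (b_k - a_k) \;:\; (a_1,b_1),\dots,(a_n,b_n) \text{ pairwise disjoint rational intervals with } \bigcup_k (a_k,b_k) \subseteq U \Big\},$$

and a supremum of rationals that can only ever be verified from below is precisely a lower real; there is in general no matching mechanism for certifying upper bounds on an arbitrary open set.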
If I may ask, the lower and upper reals coincide with the Dedekind reals in classical mathematics. What other axioms imply that these sets coincide with each other?
They only coincide (modulo infinity) if you forget the topologies, so from the pointfree perspective they are always different. I know very little about 'pointy' constructive analysis, so I couldn't say.
Graham Manuell said:
They only coincide (modulo infinity) if you forget the topologies, so from the pointfree perspective they are always different.
This kind of backs my point I made earlier in the thread that if one wants to use the locale of real numbers in constructive mathematical physics, chances are that one would end up using at least three locales of real numbers - those whose space of points are the lower reals, upper reals, and Dedekind reals, because of the heavy reliance on measure theory via integration and probability in mathematical physics.
Absolutely. But I think there are already three spaces being used classically. It's just that people don't always pay careful attention to the topologies.
For example, consider how lower semicontinuous functions are sometimes used in analysis.
I'm just trying to think how this would end up working in real cohesion.
Assuming excluded middle for the discrete ∞-groupoids, one would have the Dedekind reals, finite upper reals, and finite lower reals all be inequivalent cohesive ∞-groupoids since they come naturally equipped with different topologies - order topology, upper semicontinuity, and lower semicontinuity respectively, but one would have weak equivalences since they all have the same points and thus are the same ∞-groupoid when equipped with the discrete topology.
Madeleine Birchfield said:
If I may ask, the lower and upper reals coincide with the Dedekind reals in classical mathematics. What other axioms imply that these sets coincide with each other?
This is equivalent to LEM. Fix a sentence P and consider the lower real x = sup({0} ∪ {1 ∣ P}). If we assume x is a Dedekind real, then we have that either x > 0 or x < 1. In the first case we have P and in the second we have ¬P.
Graham Manuell said:
I think there are strong arguments for including the topological structure of objects more or less in all circumstances, but especially in constructive mathematics.
And what exactly are these arguments? I ask because from my perspective the relationship between measure theory and topology is subtle. , , and are all equivalent as measure spaces (with their standard measures) and it's fairly straightforward to make sense of this fact constructively, so already we can see that considering things up to measure erases a huge amount of topological information (compactness, connectedness, etc.). Moreover one of the most important structural facts about regular measures is that every measurable set is equivalent to a set modulo a null set.
The other problem is just that, on a practical level, complementation is an extremely important basic operation in probability theory. It allows you to extend a lot of intuition gleaned from finite event spaces to infinite event spaces. Obviously exact complementation is problematic constructively, but in measure theory all that really matters is complementation modulo a null set, and this does work well constructively.
This is what I find so uncompelling about Simpson's locale of random reals and presumably other localic approaches to measure theory. There are dense open subsets of that have measure less than , but the Heyting complement of these open sets is empty. Insisting on basing measure theory solely on open sets materially changes the mathematical content and character of the field.
James E Hanson said:
This is what I find so uncompelling about Simpson's locale of random reals and presumably other localic approaches to measure theory.
IIRC Alex Simpson is also working with excluded middle and dependent choice the entire time, so this is also irrelevant to constructive mathematics.
I brought up measurable locales and Alex Simpson's approaches as alternative approaches to measurable sets that I knew of, but they all used excluded middle to develop their theories and so aren't really suitable for constructive mathematics.
The same goes for the categorical approaches - Alex Simpson's topos of random probability sheaves assumes a Boolean topos satisfying dependent choice, and so isn't constructive.
James E Hanson said:
Madeleine Birchfield said:
If I may ask, the lower and upper reals coincide with the Dedekind reals in classical mathematics. What other axioms imply that these sets coincide with each other?
This is equivalent to LEM. Fix a sentence P and consider the lower real x = sup({0} ∪ {1 ∣ P}). If we assume x is a Dedekind real, then we have that either x > 0 or x < 1. In the first case we have P and in the second we have ¬P.
I suspect that this remains true if we take some -subframe of the set of truth values , and use that to construct -lower reals, -upper reals, and -Dedekind reals using functions or pairs of such functions. Then that every -lower or -upper real is a -Dedekind real is equivalent to LEM for all truth values in . For the initial such -frame this would result in LPO, and for the set of truth values itself this would result in the usual LEM.
Well, in any case I guess it would be nice to read through @Alex Simpson work ...
Maybe over the weekend we can also have a little discussion on the theory of locales and frames, just basic level - as we did for predicativity. So people can engage, ask questions, usw. Good idea?
(For people not familiar with German, usw = und so weiter = et cetera -- I assume this is what is meant)
Todd Trimble said:
(For people not familiar with German, usw = und so weiter = et cetera -- I assume this is what is meant)
Probably; she did her master's at Göttingen before going to CMU, so she should have some familiarity with the German language.
Federica Pasqualone said:
Maybe over the weekend we can also have a little discussion on the theory of locales and frames, just basic level - as we did for predicativity. So people can engage, ask questions, usw. Good idea?
if I have time on the weekend, which I do not know yet.
Thanks a lot for explaining my abbreviations hahhah @Todd Trimble I am particularly skilled at code mixing when I am in a rush! :grinning:
@James E Hanson One reason is that every function that can be carried out in practice is going to be continuous, so it makes sense to record this information. Another is that almost all nonconstructive principles become valid when phrased in terms of locales instead of sets (for example, the product of nontrivial sets being nontrivial). A third is that a topology is essentially precisely what is necessary to specify the points of an object not only in the current topos, but in every Grothendieck topos, and so results that take it into account tend to be stable under change of base. Finally, it seems to get rid of a lot of pathologies, such as Banach-Tarski and the like.
Since I'm arguing for remembering the topology I'm not moved by the fact that certain different spaces coincide when you forget it.
Sometimes there will be open complements up to measure zero and this might be interesting to explore. I guess I am willing to bite the bullet on not being able to have complements in general that behave in the same way as the original opens (unlike closed sublocales, which can, of course, still be discussed).
I don't really understand what your criticism has to do with Simpson's approach, since he extends the valuation to every sublocale. There are still no complements in general, but the dense open sets you suggest do have complements in his approach. The pseudocomplement is a red herring. It doesn't play any role in any of these approaches as far as I know.
I suppose I agree that basing things on opens changes the character of the field, but I think it is for the good, or at least is unavoidable. I don't agree that it changes the content.
@Graham Manuell I'm sorry but I'm not really moved by your arguments either. Basically, the broad issue to me is that real analysts already know what point-free measure theory is. Constructive mathematicians already know how to do this point-free measure theory constructively in a down-to-earth way that can be implemented in real-world computations. (In terms of your first statement this corresponds to the fact that the meaningful topology in measure theory is the topology induced by the metric on measurable sets.) Category theorists already know how to characterize Lebesgue integration in terms of a universal property. (And in particular, note that , , and all satisfy this universal property, because measure spaces themselves are fundamentally not topological.) I don't really see how you can claim that basing measure theory on open sets is 'unavoidable' when people have evidently been able to avoid it just fine.
At the end of the day, what it means to be able to take complements in measure theory and probability theory is that if is a measurable function or a random variable, then is a measurable function or a random variable. You can start a theory of integration with something like semicontinuous functions on a locale (or something like the duals of maps from lower sets in to sublocales of your locale), but at some point you're going to want to subtract one function from another or want to have your integrals be Cauchy or Dedekind reals so that you can actually produce a tangible number in a real-world computation, and at that point you need to introduce some kind of two-sided regularity condition and isolate a special class of 'measurable' functions. This is what I don't find compelling about Simpson's approach. It's often presented as if it 'fixes the existence of non-measurable sets,' but in order to actually do functional analysis with it you need to reintroduce something equivalent to the concept of measurability at some point.
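To spell out one form the two-sided condition can take (an illustration under stated assumptions, not necessarily the formulation intended above): for a bounded f on a probability space, the integral defined from below by simple functions,

$$\int f \, d\mu \;=\; \sup\Big\{ \int g \, d\mu \;:\; g \text{ simple},\ g \le f \Big\},$$

is a priori only a lower real, and it is forced to be a two-sided (Dedekind) real when f can also be approximated from above, i.e. when for every ε > 0 there are simple g ≤ f ≤ h with ∫ (h − g) dμ < ε.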
@James E Hanson I'm not sure how down-to-earth it is to use a theory where you are required to assign measures to sets for which you have no ability to tell whether a point actually lies in that set (or similar such question). I don't really understand the operational meaning of saying that a randomly chosen point lies in some arbitrary Borel set with some probability. Maybe there is some way to make sense of this (which I would be interested to hear about), but it certainly seems less conceptually straightforward than the usual pointfree approach using valuations.
In any case, "unavoidable" was surely too strong, since almost anything can be avoided with sufficient effort, but I was mainly thinking of the case where we do want to consider the topologies (which for me is always). I can think of reasonable probability distributions on Sierpinski space where the probability of choosing is just not a Dedekind real. It would seem that such cases would not fit into these other approaches and this feels like a deficiency.
If you are talking about random variables instead of measures, then these can be (Dedekind-)real-valued and there is no issue with taking . You can integrate a (Dedekind-)real-valued function over a compact Hausdorff locale with a finite valuation to give a Dedekind real. (Of course, we necessarily restrict to continuous functions here, which I do not see as a disadvantage.) Just because the valuations take values in one-sided reals, does not mean we cannot consider real-valued integrals. In general, the functional analysis approach to measures (if done in an appropriately pointfree way) should be equivalent to valuations in this case. (I don't know if all of the details of this last part have actually been worked out yet, but a fair bit has been.)
I don't know the details of how integration would be handled in Simpson's approach (though I would be surprised if you really do need some restricted concept of measurable sublocale; measurable functions, sure, but we'd probably just restrict to continuous ones), but that approach is slightly different to the one I am advocating for (though I still think his approach is pretty cool).
Good morning measure theorists and QFT lovers! :sunglasses:
I am still down for an amazing basic discussion on lattice theory, don't worry ... coming soon! :books:
But also (metacomment): I hope we will have some rooms for group work at the next ACT2025! :sparkles:
It sounds like we have a lot to do here ... :nerd: don't we?
Graham Manuell said:
(Of course, we necessarily restrict to continuous functions here, which I do not see as a disadvantage.)
Physicists (and other appliers of mathematics) use and integrate discontinuous functions all the time. Obviously it's heavily debatable whether there are any truly discontinuous phenomena in reality, but working with discontinuous functions is a practically useful idealization, much like the real numbers themselves. (See, for instance, shock wave weak solutions of PDEs.) If your theory of integration can only handle continuous functions (under the assumption that one wants to be able to subtract functions), then I'm sorry but you are changing the mathematical and practical content of the field.
Graham Manuell said in part:
I'm not sure how down-to-earth it is to use a theory where you are required to assign measures to sets for which you have no ability to tell whether a point actually lies in that set (or similar such question).
You can't even always tell whether a given point lies in a given open set. (If it does lie in the open set, then eventually you'll be able to tell, but if it doesn't, then you might not.) At least if the open set is a located interval, then for almost every point you can tell whether it belongs to the set, but even that doesn't work for arbitrary open sets. I agree that any open set can be given a lower real as a measure, but I can also see why one might consider a set to be measurable only when its measure is a Dedekind real (and perhaps the more general case should only be considered the outer measure).
I would be surprised if you really do need some restricted concept of measurable sublocale; measurable functions, sure, but we'd probably just restrict to continuous ones
Constructive analysis knows a lot of discontinuous functions that are defined almost everywhere, and I'd like to integrate those too.
Graham Manuell said in part:
Classically, the theory of valuations agrees with that measures in 'good' situations (Radon measures on compact Hausdorff spaces).
This reminds me that when Bourbaki got to measure spaces, they decided to only deal with Radon measures on (I think) locally compact Hausdorff spaces.
Measurable locales similarly capture only ‘good’ situations, specifically those that satisfy big theorems like the Radon–Nikodym Theorem, which are often stated only for σ-finite measures but also apply to some others. (This is why I got interested in them; and since I also like things to be constructive, that's why the nLab has some musings about how to make them constructive. The idea that locales=constructive is a red herring.)
James E Hanson said:
Physicists (and other appliers of mathematics) use and integrate discontinuous functions all the time.
Physicists also use and integrate things all the time that aren't even functions, like "δ-functions". These can be defined mathematically as distributions, elements of the dual space of a space of smooth (and in particular continuous) functions. Discontinuous functions can also be considered (up to almost-everywhere equality) as particular distributions, even if the only actual "functions" are continuous. I don't know what the theory of distributions looks like constructively, but it would be one approach to make sense of the discontinuous functions used in applied mathematics, which would generalize the accepted way of making sense of the non-functions used in applied mathematics.
@James E Hanson I'm currently working on a paper with @Peter Faul on how to handle 'discontinuous' functions of the kind needed for things like shock waves. I don't think they would actually pose an issue for integration in this system either. I guess I don't really care whether you think it changes the content or not. In so far as it differs, why are you so certain that what the other approaches do is correct? I do care that it can model all physical phenomena, but I believe that with sufficient care it can. As @Mike Shulman notes, distributions should also be able to be accommodated.
Toby Bartels said:
You can't even always tell whether a given point lies in a given open set. (If it does lie in the open set, then eventually you'll be able to tell, but if it doesn't, then you might not.)
Indeed, this is why the measure needs to take values in the lower reals instead of the Dedekind reals.
At least if the open set is a located interval, then for almost every point you can tell whether it belongs to the set, but even that doesn't work for arbitrary open sets. I agree that any open set can be given a lower real as a measure, but I can also see why one might consider a set to be measurable only when its measure is a Dedekind real (and perhaps the more general case should only be considered the outer measure).
Okay, I can understand this perspective. This result about located intervals is very interesting and feels like it might be related to something I have been thinking about regarding valuations. Do you know somewhere I can read about this?
Toby Bartels said:
Constructive analysis knows a lot of discontinuous functions that are defined almost everywhere, and I'd like to integrate those too.
I haven't given it enough thought yet whether this is possible in the approach I am advocating for, but in Simpson's approach perhaps they can be understood as maps out of the smallest sublocale of full measure?
Graham Manuell said in part:
Toby Bartels said:
At least if the open set is a located interval, then for almost every point you can tell whether it belongs to the set, but even that doesn't work for arbitrary open sets.
This result about located intervals is very interesting and feels like it might be related to something I have been thinking about regarding valuations. Do you know somewhere I can read about this?
It's nothing deep. It's just that if the interval is located, then it has a (possibly infinite) Dedekind real as both lower and upper endpoint (which depending on what one means by ‘interval’ one might assume anyway); and any point (Dedekind real) apart from the endpoints is either in the set or in its exterior (both open sets, so you can tell which). If ‘almost every’ means anything constructively in a pointwise context, then that includes everything distinct from a given two points.
But to be formal about it: If an interval 𝐼 is located, then its upper bound is given by the two-sided Dedekind cut ({𝑟 ∈ ℚ ∣ ∃𝑥 ∈ 𝐼, 𝑟 < 𝑥}, {𝑠 ∈ ℚ ∣ ∀𝑥 ∈ 𝐼, 𝑥 < 𝑠}),
and similarly for the lower bound. And these are Dedekind reals because, given rational 𝑟 < 𝑠, the locatedness of 𝐼 immediately tells us that 𝑟 belongs to the lower set or 𝑠 belongs to the upper set (with the rest being obvious, allowing the upper or lower set to be empty to give ±∞). And then, writing the endpoints as 𝑎 and 𝑏, given any 𝜀 > 0, and any Dedekind real 𝑥, we have 𝑥 < 𝑎, 𝑎 − ⅕𝜀 < 𝑥 < 𝑎 + ⅕𝜀, 𝑎 < 𝑥 < 𝑏, 𝑏 − ⅕𝜀 < 𝑥 < 𝑏 + ⅕𝜀, or 𝑥 > 𝑏, meaning that each point in the Dedekind real number line belongs to at least one of the interior of 𝐼, the exterior of 𝐼, and a collection of (in this case two) open intervals whose total length converges to a Dedekind real less than 𝜀 (specifically to ⅘𝜀).
Graham Manuell said:
Toby Bartels said:
Constructive analysis knows a lot of discontinuous functions that are defined almost everywhere, and I'd like to integrate those too.
I haven't given it enough thought yet whether this is possible in the approach I am advocating for, but in Simpson's approach perhaps they can be understood as maps out of the smallest sublocale of full measure?
Perhaps; I don't even know if there is a smallest sublocale of full measure, but it seems reasonable. You can treat common discontinuous functions (such as a square wave or a binning/bucketing function) up to almost-everywhere equality as continuous maps on an ad-hoc sublocale; and besides, you can treat any integrable function (or generalized function) up to almost-everywhere equality as a distribution (as Mike remarked), so I don't mean this as an argument against working with locales. It's just that in measure theory, we really don't want to restrict attention to continuous functions, so we do have to check that we can handle the others.
Toby Bartels said:
I don't mean this as an argument against working with locales. It's just that in measure theory, we really don't want to restrict attention to continuous functions, so we do have to check that we can handle the others.
There's no guarantee that you'll have any discontinuous functions to work with even on the Dedekind reals in constructive mathematics. One version of Brouwer's continuity principle says that every function on the Dedekind reals is continuous and that principle is consistent with intuitionistic logic. If you want discontinuous functions on your Dedekind real numbers, then assume the [[analytic LPO]].
Madeleine Birchfield said:
Toby Bartels said:
I don't mean this as an argument against working with locales. It's just that in measure theory, we really don't want to restrict attention to continuous functions, so we do have to check that we can handle the others.
There's no guarantee that you'll have any discontinuous functions to work with even on the Dedekind reals in constructive mathematics. One version of Brouwer's continuity principle says that every function on the Dedekind reals is continuous and that principle is consistent with intuitionistic logic.
But there are lots of discontinuous functions that are defined almost everywhere, as I remarked upthread, and this is all that you need for measure theory. For example, the Heaviside step function is defined on (at least) every real number apart from 0, which is almost all of them. (To be precise, for each 𝜀 > 0, for some collection 𝒞 of open intervals (with rational endpoints if you like) whose total length converges to less than 𝜀, for each 𝑥, the function is defined at 𝑥 or 𝑥 belongs to at least one of the intervals in 𝒞. In this case, we can take just one interval, ]−⅓𝜀, ⅓𝜀[.)
Toby Bartels said:
But there are lots of discontinuous functions that are defined almost everywhere, as I remarked upthread, and this is all that you need for measure theory.
Well those are partial functions and are automatically discontinuous by virtue of not being defined on the entire domain. Because you are already using subsets of the real numbers to define partial functions, in locale theory it would make sense to use some sublocale of the real numbers for defining corresponding partial functions which are discontinuous but defined almost everywhere.
Do measure theorists use "function" as a synonym for "partial function"?
I might get rid of this sentence from [[measurable locale]]
"Simpson’s theory of sigma-locales, see below, is developed in classical framework, but could be a good starting point for a constructive theory, as it avoids the focus on complements. (But the main problem is completeness, not complements.)"
since I don't think the nLab should take an editorial stance on these issues, and this sentence seems to be one of the sources of @James E Hanson's comments earlier in this thread.
Madeleine Birchfield said in part:
Do measure theorists use "function" as a synonym for "partial function"?
No, but they use ‘function’ to mean a function up to almost-everywhere equality. If you give them a partial function defined almost everywhere, then they'll immediately replace it with a function defined everywhere (if they're not constructivists) but refuse to tell you which one because it doesn't matter. If you're a constructivist and you tell them that you'd just as soon leave it undefined on a null set, then they'll shrug and not care even if they don't understand your reasons. If you're a constructivist measure theorist and leave your own function undefined on a null set, the other measure theorists probably won't even notice.
And a good thing too! Because it's up to the constructivists to make constructive math that works, not to close our eyes and say ‘Discontinuous functions don't exist, so I don't have to deal with them!’. Discontinuous functions do exist constructively, just not from ℝ to ℝ, and we need to be able to handle them. Since the ones of interest are defined almost everywhere, it's not a problem.
(Actually, if you give a measure theorist a partial function that's not defined almost everywhere and say that you want to integrate it, they'll handle that too, by defining it to be 0 outside its original domain. But that's a different matter, and it doesn't work for an arbitrary codomain.)
Madeleine Birchfield said:
I might get rid of this sentence from [[measurable locale]]
"Simpson’s theory of sigma-locales, see below, is developed in classical framework, but could be a good starting point for a constructive theory, as it avoids the focus on complements. (But the main problem is completeness, not complements.)"
since I don't think the nLab should take an editorial stance on these issues, and this sentence seems to be one of the sources of James E Hanson's comments earlier in this thread.
The nLab takes lots of editorial stances, but this one's a bit muddled, because @Bas Spitters added the bit about how Simpson's σ-locales don't need complements, and I added the bit about how complements aren't really the source of the problem. I wouldn't mind getting rid of the whole thing, or perhaps replacing it with a note that σ-locales are a different approach that shouldn't be confused with measurable locales just because they both have ‘locale’ in the name. (And if I'm not mixed up, neither of these was intended as a constructive approach, even if constructivists who like them for other reasons might like to fix that.)
Thanks a lot guys for carrying on this wonderful discussion during the start of the semester! I really appreciate it! :folded_hands:
And @Mike Shulman is right about distribution theory and physicists' tricks :big_smile: in general ...
The fact is: when you are dealing with an open issue in mathematical physics, everything helps everything! Mathematical physics is the land of creativity and ingenious non-standard solutions ... :smirk:
Smearing included :heart_eyes: :innocent:
Mike Shulman said:
James E Hanson said:
Physicists (and other appliers of mathematics) use and integrate discontinuous functions all the time.
Physicists also use and integrate things all the time that aren't even functions, like "δ-functions". These can be defined mathematically as distributions, elements of the dual space of a space of smooth (and in particular continuous) functions. Discontinuous functions can also be considered (up to almost-everywhere equality) as particular distributions, even if the only actual "functions" are continuous. I don't know what the theory of distributions looks like constructively, but it would be one approach to make sense of the discontinuous functions used in applied mathematics, which would generalize the accepted way of making sense of the non-functions used in applied mathematics.
I used to be in a physics PhD program so I am intimately familiar with this. That said, one of the big issues with distributions is that one often wants to multiply them, but this doesn't really make sense mathematically. (Moreover the fact that it doesn't make sense really is a problem in applications like QFT.) Discontinuous functions and traditional measure theory do make sense mathematically though, so I don't really see the point in relegating discontinuous functions to a more exotic and poorly behaved class of mathematical objects than they need to be.
Yes, it's true that multiplying distributions is fraught in general. I think there are certain contexts in which you can do it, which might include discontinuous functions. But I think Toby's suggestion to regard discontinuous functions as just almost-everywhere-defined is even better.
Graham Manuell said:
@James E Hanson I'm currently working on a paper with @Peter Faul on how to handle 'discontinuous' functions of the kind needed for things like shock waves. I don't think they would actually pose an issue for integration in this system either. I guess I don't really care whether you think it changes the content or not. In so far as it differs, why are you so certain that what the other approaches do is correct? I do care that it can model all physical phenomena, but I believe that with sufficient care it can. As @Mike Shulman notes, distributions should also be able to be accommodated.
I'm not saying that you cannot make an approach involving locales work, but I am saying that I don't see any reason for starting with locales when in probability theory the original topology of a given measure space is entirely irrelevant and in functional analysis the original topology is only loosely relevant. I would argue that really in that case it's the metric structure that is relevant (since you can't define, e.g., derivatives in purely topological terms) and so something like Nik Weaver's metric measure spaces (which can be given a point-free description easily, as I believe Weaver does in his book in terms of Lipschitz algebras) do a far better job of abstracting the relevant structure than a topology-focused approach like that of locales.
In fact, I would argue that locales are just a bad approach to numerical analysis in general since the category of locales doesn't compute product topologies of all metric spaces correctly.
Mike Shulman said:
Yes, it's true that multiplying distributions is fraught in general. I think there are certain contexts in which you can do it, which might include discontinuous functions. But I think Toby's suggestion to regard discontinuous functions as just almost-everywhere-defined is even better.
I agree. It's a perfectly reasonable approach and lines up well with how things like step functions behave in computational contexts (e.g., there is a program that computes the Heaviside step function but it only converges on inputs that are apart from 0).
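As a small illustration of that computational behaviour (a sketch only; the representation of reals as functions giving rational approximations is an assumption made here for concreteness, not anyone's actual library):

```python
from fractions import Fraction

def heaviside(x):
    """Heaviside step of a real number, where x is a function
    n -> Fraction with |x(n) - (true value)| <= 2**-n.
    The search terminates exactly when the input is apart from 0."""
    n = 0
    while True:
        a = x(n)                 # rational approximation within 2**-n
        err = Fraction(1, 2**n)
        if a - err > 0:          # certifies that the real is > 0
            return 1
        if a + err < 0:          # certifies that the real is < 0
            return 0
        n += 1                   # at 0 itself, no approximation ever decides

# example: 1/3 is apart from 0, so this terminates and prints 1
print(heaviside(lambda n: Fraction(1, 3)))
```

The non-termination at 0 is the computational counterpart of the function only being defined on reals apart from 0.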
Honestly, the σ-locale approach to measure theory feels like somebody asked "what is the most general structure we can define a countably additive measure on?", the answer turned out to be a σ-frame, and then they began developing measure theory at this level of generality.
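For reference, the structure in question (the standard definition, not tied to any particular paper): a σ-frame is a poset with finite meets and countable joins in which binary meets distribute over countable joins,

$$a \wedge \bigvee_{n \in \mathbb{N}} b_n \;=\; \bigvee_{n \in \mathbb{N}} (a \wedge b_n),$$

and a σ-locale is an object of the opposite category; a countably additive valuation is then a suitably modular and continuous map out of a σ-frame.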
James E Hanson said:
Physicists (and other appliers of mathematics) use and integrate discontinuous functions all the time.
Indeed! They just have the good sense not to evaluate discontinuous functions at a point of discontinuity and expect the answer to be useful.
Mathematical physicists (a breed of mathematician, not physicist) formalize such ideas using functional analysis. We almost never use arbitrary real or complex-valued functions: instead, we work with them in a 'typed' manner. That is, we work with many different 'function spaces', like spaces and Sobolev spaces, and keep track of what we're allowed to do to elements of these various spaces. E.g. we can define
if is in and is measurable. We can differentiate any element of , but we get an element of the Sobolev space , which is a particular sort of distribution.
For most of these function spaces, the elements are not really functions in the traditional sense: they could be distributions, or they could be equivalence classes of functions. And in the latter case, it's often (though by no means always) true that changing the value of a function at just one point does not change its equivalence class. Then evaluation at a point is undefined. And this is fine, because we are using these functions for other things: we have no intention of evaluating them at a point!
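One concrete instance of this 'typed' discipline (chosen here purely as an illustration, not necessarily the example intended above): pointwise multiplication followed by integration only type-checks between conjugate spaces, since Hölder's inequality gives a bounded bilinear map

$$L^p(\mu) \times L^q(\mu) \longrightarrow L^1(\mu), \qquad \|fg\|_1 \le \|f\|_p\, \|g\|_q, \qquad \tfrac{1}{p} + \tfrac{1}{q} = 1.$$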
I was thinking about something yesterday: if you try to do ergodic theory constructively, you are mixing together something that is very natural to see with category-theorist's eyes (dynamical systems) and something not-so-easy (measure theory).
I'd be curious to see what can be done and what comes out of it, e.g. what the def'n of ergodicity becomes.
Mike Shulman said:
Yes, it's true that multiplying distributions is fraught in general. I think there are certain contexts in which you can do it, which might include discontinuous functions. But I think Toby's suggestion to regard discontinuous functions as just almost-everywhere-defined is even better.
Here is the standard approach: https://ncatlab.org/nlab/show/Colombeau+algebra
Toby Bartels said:
It's just that if the interval is located, then it has a (possibly infinite) Dedekind real as both lower and upper endpoint (which depending on what one means by ‘interval’ one might assume anyway); and any point (Dedekind real) apart from the endpoints is either in the set or in its exterior (both open sets, so you can tell which). If ‘almost every’ means anything constructively in a pointwise context, then that includes everything distinct from a given two points.
Ah, thank you. I had forgotten exactly what located meant here. I will think about this further.
I don't even know if there is a smallest sublocale of full measure, but it seems reasonable.
Simpson proves this in his paper (at least when the measure of the space is finite).
James E Hanson said:
I'm not saying that you cannot make an approach involving locales work, but I am saying that I don't see any reason for starting with locales when in probability theory the original topology of a given measure space is entirely irrelevant and in functional analysis the original topology is only loosely relevant.
This seems to be the crux of our disagreement, because I think the topology is absolutely crucial and so I am never going to be happy with an approach where it is ignored. I gave a number of reasons why I think this above, but I suppose you did not find them convincing. If you do not buy that the topology is important, I can see why you wouldn't like this approach, but for me this is the entire point.
In fact, I would argue that locales are just a bad approach to numerical analysis in general since the category of locales doesn't compute product topologies of all metric spaces correctly.
This is completely backwards in my opinion. You are saying that just because the localic product disagrees with the topological product it is 'incorrect', but I would say it is the topological product that is incorrect! You really need to give an argument why you think the topological product is the correct one. The localic one is the one with all the good properties in terms of compactness, etc. (Also, if you equip a discrete locale with a metric, then finite products will agree with the spatial ones. You seem to not care about the topology, so this is probably what you want.)
Thanks @Toby Bartels for tagging me. I haven't read up on the whole discussion yet, but some quick comments:
There's a recent preprint on sigma-locales for measure theory: https://arxiv.org/pdf/2408.13911
It's also possible to develop measure theory synthetically: https://www.cambridge.org/core/journals/mathematical-structures-in-computer-science/article/synthetic-topology-in-homotopy-type-theory-for-probabilistic-programming/1EC63B25D83E58C51DAB8D3A0AF567EB
This can be done in a number of toposes. Condensed sets is one of them.
@fosco I'm in the middle of something else now, but Bishop and Nuber have already worked on Ergodic theory.
https://www.jstor.org/stable/24901875
https://www.ams.org/journals/tran/1972-164-00/S0002-9947-1972-0291411-3/S0002-9947-1972-0291411-3.pdf
Here's my contribution to the topic:
https://projecteuclid.org/journalArticle/Download?urlid=10.2178%2Fjsl%2F1146620162
JR mentioned [[Colombeau algebra]] as a way to multiply distributions, but I haven't seen anyone use this to prove interesting results in analysis, and the nLab article explains some reasons why. Wikipedia claims that Colombeau algebras "have found numerous applications in the fields of partial differential equations, geophysics, microlocal analysis and general relativity so far", but it doesn't link to any examples, and I think this claim is misleading. Probably some people have tried applying it to all these subjects - but it hasn't caught on, because it doesn't help much.
Schwartz's result put limits on how well we can multiply distributions in general, so most analysts try to restrict to multiplying distributions whose singularities don't get in each other's way. See section 2 of A smooth introduction to the wavefront set for a pedagogical introduction to multiplying distributions, leading up to Theorem 13, a criterion for when you can multiply distributions based on the all-important notion of [[wavefront set]]. This is more like the math I usually see when people apply multiplication of distributions to problems in nonlinear PDE, quantum field theory, etc.
John Baez said:
JR mentioned [[Colombeau algebra]] as a way to multiply distributions, but I haven't seen anyone use this to prove interesting results in analysis, and the nLab article explains some reasons why. Wikipedia claims that Colombeau algebras "have found numerous applications in the fields of partial differential equations, geophysics, microlocal analysis and general relativity so far", but it doesn't link to any examples, and I think this claim is misleading. Probably some people have tried applying it to all these subjects - but it hasn't caught on, because it doesn't help much.
There's a nice book Geometric Theory of Generalized Functions with Applications to General Relativity by a group of mathematical physicists in Vienna. These people have been applying Colombeau algebras to general relativity over the past 25 years or so, and it certainly looks like they're claiming success by saying that these structures are better suited for their purposes than mere distributions:
(image attachment: excerpt from the book)
As far as I understand, the overarching theme here is general relativity on spacetimes with low regularity. That may sound boring and irrelevant when you first see it, but of course many actual physical models involve functions that are not smooth, such as the mass density in an idealized spacetime modelling a star.
Another person who's done a lot of work in this direction is James Vickers -- not to be confused with Steve Vickers, who's worked on constructive measure theory!
Okay, I'd really like to see if they do something practical with Colombeau theory and whether it goes beyond the usual rules for multiplication of distributions (namely, you can multiply them if the wavefront sets of the two distributions intersect transversely, or something like that - I'm no expert).
So now I have another new hobby.
I'm more interested in seeing what they prove about solutions of Einstein's equations than reading about the general theory - I once studied Colombeau algebras hoping they'd help me with quantum field theory, and I eventually decided I couldn't do anything with it. One of the authors of that book is named Michael Oberguggenberger and he wrote a paper called Solutions to semilinear wave equations of very low regularity. The abstract makes him sound like an ordinary analyst, not a Colombeau cultist, so that's good.
Right, that sounds pretty accurate given my personal impression, thanks to Michael's office just being one floor down from mine :upside_down: He's the only one among those authors not based directly in Vienna. (In practice, we haven't gotten around to talking often -- partly due to an idiosyncratic system of there being three or four different math departments in one and the same building and university, and him being at another one.)
But that's how I've known about this stuff.
Please sometime ask him about the usefulness of Colombeau theory!
Graham Manuell said:
but for me this is the entire point.
The entire point of what? If your approach to measure theory doesn't have that and are the same Banach algebra at the end of the day, then you're not doing the mathematics that most people mean when they say 'measure theory.' Obviously there are times when a case can be made that certain fundamental definitions and results in a field are 'wrong' in the sense that they are going down the wrong path, but for a big successful field like functional analysis you need really solid arguments supporting such a claim and your arguments absolutely have not established this. For example, one of your arguments was about 'functions that can be carried out in practice,' but (a) as discussed elsewhere in this thread it is actually possible to assign computational meaning to things like step functions as partial functions and (b) almost all of the familiar spaces have computational presentations as Banach algebras, which means that one can in fact work with the traditional objects in measure theory, probability theory, and functional analysis 'in practice.'
This is completely backwards in my opinion. You are saying that just because the localic product disagrees with the topological product it is 'incorrect', but I would say it is the topological product that is incorrect! You really need to give an argument why you think the topological product is the correct one. The localic one is the one with all the good properties in terms of compactness, etc.
A lot of topology is supposed to be an abstraction of the behavior of metric spaces. Addition is a continuous operation on the rationals with the ordinary metric, but it isn't continuous localically. Likewise, when I look at the direct sum of Banach spaces, the topology on it should be both the topology induced by the norm and the product topology of and , but localically it's usually not. Locales only do a good job of capturing ordinary intuition about topology in the context of locally compact spaces. [Edit: This is wrong. The localic product of any two complete metric spaces is spatial (classically).]
I am a topologist. I do not care about measure theory per se. I care about being able to deal with probability and integration. Anyway, it is probably not possible to go into enough detail in this medium, and you seem to have a distaste for any mathematics that is not 100% within the dominant paradigm.
For what it's worth, you can get spaces no problem by completing 'locales of continuous functions'. Probably those two spaces are still isomorphic. This feels like a separate point to what we were discussing before. I am also happy with step functions as partial functions. I already told you that I am writing a paper on this. General measurable functions seem like a different story though since they can be discontinuous everywhere.
You are using the wrong topology on the rationals. The rationals naturally have the discrete topology. I don't know enough about Banach spaces to comment on that example, but for any separable Banach space the localic and spatial products will agree (classically).
You are defining "ordinary intuition about topology" to mean "everything works exactly like topological spaces", without interrogating why things ought to work that way. (And even then Polish spaces would also behave like you expect.) Good luck doing point-set topology constructively; I do not envy you.
Graham Manuell said:
I don't know enough about Banach spaces to comment on that example, but for any separable Banach space the localic and spatial products will agree (classically).
I realized that I was misinterpreting a result that I had read. I believe that the results of this paper imply that every localic product of Banach spaces is spatial, regardless of size.
Graham Manuell said:
General measurable functions seem like a different story though since they can be discontinuous everywhere.
The whole point of the normal approach to measure theory is that we get actual point-set function representations of all of the elements of the completion of a given probability algebra. (This isn't even that hard to do constructively.) This is sometimes a useful thing to have. An analogously useful feature of a localic approach to measure theory would be that every element of a space like can be represented by some function-like object on the original locale, but this requires the ability to treat functions that can be discontinuous everywhere.
My point is that if your approach to measure theory isn't going to offer a feature like this, then there's not really a strong reason to start the theory with locales rather than starting with the already established point-free presentation of measure theory in terms of probability algebras (i.e., metric Boolean algebras).
Graham Manuell said:
You are defining "ordinary intuition about topology" to mean "everything works exactly like topological spaces", without interrogating why things ought to work that way.
You are putting words in my mouth. I was only talking about ordinary metric spaces (complete or incomplete), not arbitrary topological spaces. I admit that I had the wrong impression about how well locales handle topology on metric spaces and that complete metric spaces are a lot more important than arbitrary metric spaces, but I still think that ℚ with the ordinary metric is not that ridiculous of an example.
Graham Manuell said:
Good luck doing point-set topology constructively; I do not envy you.
I was never intending to do point-set topology constructively. Honestly, the fact that point-set topology and other parts of math that I use regularly (like ordinals) behave so poorly in constructive math just makes me want to stop thinking about constructive math, if anything.
James E Hanson said:
Honestly, the fact that point-set topology and other parts of math that I use regularly (like ordinals) behave so poorly in constructive math just makes me want to stop thinking about constructive math, if anything.
That's okay, not everyone has to do constructive math. I'm very happy that Graham is doing it, though.
Good evening everyone!
Where were we? :thinking: Ah locales .... :heart_eyes:
Well, I was investigating some properties of open sets in writing my Master's thesis, left alone in the library with a lot of dangerous books ... and I found out we can actually play around ... ehm make some rigorous logic out of them!
The frame of open sets ... but what is a frame? It is the dual notion to the one of locale! :upside_down:
And still I haven't answered the question, I know.
The category of frames Frm should have some objects and morphisms to start with hahha
And indeed they are complete lattices A satisfying some good property, i.e.
a ∧ ⋁S = ⋁{a ∧ s : s ∈ S}, where a ∈ A and S ⊆ A,
and arrows preserve arbitrary joins, finite meets, top and bottom.
Your set-theoretical gut feelings :smile: are right: ∪ and ∩ are just join and meet.
There are also a couple of words here needing an explanation, namely complete and lattice - that is why I wrote down lattice theory some comments ago!
For sure we need some notion of order if we wanna say this thing is greater than that one, or smaller than it, and then we can define the meet and join of two elements, respectively - that yes, are dual of each other.
More precisely, let (P, ≤) be an ordered set; then, given two elements a, b ∈ P, their meet a ∧ b is the greatest element such that a ∧ b ≤ a and a ∧ b ≤ b.
The join is left as an exercise!
When we equip an ordered set with these two binary operations, meet and join, it becomes a lattice.
Completeness? Well, as you are imagining, it is about having a sup and an inf for all subsets of such a lattice! And look, we have nice symbols for them: ⋁ and ⋀. :innocent:
It is also true that this is just one point of view, and I was curious enough to discover the categorical side of the story. What if the idempotent, commutative, monoidal structures given on L by the join and the meet with their "neutral elements" top and bottom are transforming L into a lattice? :smirk:
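If anyone wants to poke at these definitions concretely, here is a minimal sketch in Python (my own toy example, nothing from the discussion): it takes the powerset of the made-up set {0,1,2}, ordered by inclusion, as a complete lattice with union and intersection as join and meet, and brute-forces the frame law a ∧ ⋁S = ⋁{a ∧ s : s ∈ S} over every family S.

```python
# Toy check, assuming only the definitions above: the powerset P({0,1,2}),
# ordered by inclusion, is a complete lattice where binary meet/join are
# intersection/union, and the frame law  a ∧ ⋁S = ⋁{a ∧ s : s ∈ S}  holds.
from itertools import combinations

X = {0, 1, 2}
opens = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def meet(a, b):
    return a & b                      # intersection

def join(S):
    return frozenset().union(*S)      # union; the empty join is the bottom element

for a in opens:
    for r in range(len(opens) + 1):
        for S in combinations(opens, r):          # every family has a join (union),
            lhs = meet(a, join(S))                # so the lattice is complete
            rhs = join([meet(a, s) for s in S])
            assert lhs == rhs

print("infinite distributive law holds on P({0,1,2})")
```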
And what about this Heyting?
I guess what you're looking for is that every frame is a complete Heyting algebra.
But frame homomorphisms, unlike homomorphisms of complete Heyting algebras, do not have to preserve the Heyting implication.
Here we go ...
But @Madeleine Birchfield, what is a Heyting algebra? :mic: Let us define it once and for all!
A Heyting algebra is a Cartesian closed lattice.
For those who are not familiar with category theory, there is an order theoretic or lattice theoretic definition of a Heyting algebra but I can't seem to remember it at the moment.
hahah the perfect definition for the working mathematicians ... in CT!
Well part of the job must also be done by the audience, that will carefully look it up! :wink:
or wait for tomorrow ...
The nLab says that a [[Heyting algebra]] is a lattice L with an implication ⇒ such that for all a, b, and c in L, a ∧ b ≤ c if and only if a ≤ b ⇒ c.
When meet with a fixed element, i.e. the operation a ∧ (−), has a right adjoint, namely the Heyting implication a ⇒ (−), the lattice is said to be Heyting. If it is also antisymmetric, it is a Heyting algebra.
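To make the adjunction tangible, here is a hedged continuation of the same toy powerset example from above (again my own sketch, not anything from the thread): the Heyting implication b ⇒ c is the largest a with a ∧ b ≤ c, and one can brute-force check that a ∧ b ≤ c iff a ≤ (b ⇒ c).

```python
# Toy check of "meet ⊣ implication" in the powerset frame on {0,1,2}.
from itertools import combinations

X = {0, 1, 2}
opens = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def implies(b, c):
    # join of all a with a ∧ b ≤ c; in a powerset this works out to (X \ b) ∪ c
    return frozenset().union(*[a for a in opens if (a & b) <= c])

for a in opens:
    for b in opens:
        for c in opens:
            assert ((a & b) <= c) == (a <= implies(b, c))   # the adjunction

print("a ∧ b ≤ c  iff  a ≤ (b ⇒ c)  on P({0,1,2})")
```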
Here's a question I have. Why do we need two different notions, the concepts of "frame" and "locale"?
Ah this has to do with sober spaces :heart_eyes:
There is a functor O : Top → Frm that takes a topological space to its frame of open sets, and we want it to be a full embedding when restricted to sober spaces.
More generally, locales are a generalisation of topological spaces. A locale is defined as a set X with an arbitrary frame O(X), while a topological space is defined as a set X with a frame O(X) of subsets of X.
Frames are algebraic structures which describe how the opens behave, while locales are spaces.
The usual duality between algebra and geometry! :sunglasses:
But that is contravariant, and we like to have it covariant for the full embedding. In fact, locales extend sober spaces.
Now, a continuous function between locales X and Y is defined as a function f : X → Y with a frame homomorphism f⁻¹ : O(Y) → O(X). Why are continuous functions defined as such?
Something that bugs me is where to put σ-algebras in the picture (Boolean algebras, Heyting algebras, frames, etc.)
Peva Blanchard said:
Something that bugs me is where to put σ-algebras in the picture (Boolean algebras, Heyting algebras, frames, etc.)
σ-algebras are related to σ-topological spaces and σ-frames and σ-locales and σ-complete lattices. They all only have countable joins / unions, rather than all joins / unions.
In particular, a measurable space is a σ-topological space where the open subsets are also closed under countable intersections, and the σ-algebra of the measurable space is just the σ-frame of open subsets of the measurable space.
Oh interesting, thank you.
I think I have trouble formulating exactly what bugs me, but let me give it another try.
On one hand, it seems that Boolean algebras, Heyting algebras, or frames/locales are notions that could be labeled as "algebraic/geometric". E.g., Stone duality, or frame/locale duality, etc.
On the other hand, σ-algebras seem to target measure theory or probability theory. This seems "less algebraic/geometric" (although it could just be that I lack culture in that specific domain)
For instance, I guess we can axiomatize the concept of σ-algebra (e.g. a lattice with countable joins/meets), and have a category of σ-algebras. If we take the opposite of this category, do we get "generalized measurable spaces"? (the same way locales generalize topological spaces)
Well, locales being the opposite category of frames has a specific origin, namely the fact that a continuous function between topological spaces is one where the inverse image function on open subsets is a frame homomorphism.
It looks like the definition of a measurable function also involves the inverse image; namely a measurable function between measurable spaces is one where the inverse image function on measurable subsets is a homomorphism of lattices with countable joins/meets. So it should be possible to define point-free measurable spaces that way.
The problem with measure theory is that one also talks about measurable functions which are partial functions which are almost-everywhere-defined except for a set of measure zero. And I don't know how that fits into the point-free approach.
Madeleine Birchfield said:
Now, a continuous function between locales X and Y is defined as a function f : X → Y with a frame homomorphism f⁻¹ : O(Y) → O(X). Why are continuous functions defined as such?
Anyways, this answers my question above. Locales are generalisations of topological spaces, and a continuous function between topological spaces X and Y is a function f : X → Y such that the inverse image function on open subsets of Y, f⁻¹ : O(Y) → P(X), factors into a frame homomorphism to O(X) and an injection from O(X) to the power set P(X). And since O(X) ⊆ P(X), one usually suppresses the injection and directly writes f⁻¹ : O(Y) → O(X). So a continuous function between locales should be a function f : X → Y with a frame homomorphism f⁻¹ : O(Y) → O(X).
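As a concrete, entirely illustrative sketch of "continuous map = frame homomorphism on inverse images" (the finite space, topology and map f below are all made up): the snippet checks that f⁻¹ sends opens to opens and preserves all joins, finite meets, top and bottom.

```python
# Hypothetical finite example: f : X -> Y continuous, and f⁻¹ : O(Y) -> O(X)
# is a frame homomorphism (preserves arbitrary joins and finite meets).
from itertools import combinations

X, Y = {0, 1, 2}, {0, 1}
OX = [frozenset(s) for s in [(), (1,), (2,), (1, 2), (0, 1, 2)]]  # a topology on X
OY = [frozenset(s) for s in [(), (1,), (0, 1)]]                   # Sierpinski-like topology on Y
f = {0: 0, 1: 1, 2: 1}

def preimage(V):
    return frozenset(x for x in X if f[x] in V)

assert all(preimage(V) in OX for V in OY)                         # continuity

for r in range(len(OY) + 1):                                      # preserves all joins
    for S in combinations(OY, r):
        assert preimage(frozenset().union(*S)) == frozenset().union(*(preimage(V) for V in S))

for U in OY:                                                      # preserves finite meets
    for V in OY:
        assert preimage(U & V) == preimage(U) & preimage(V)

print("f⁻¹ : O(Y) -> O(X) is a frame homomorphism")
```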
Next question, though I would wait for @Federica Pasqualone:
What is a filter on a lattice, and what does it mean for a filter to be completely prime?
I wonder if any people are aiming to do algebraic topology with locales. (Although perhaps it would be called something else, like "algebraic locality"!)
This nLab article seems somewhat relevant: localic homotopy theory. For example, that article gives some references that consider the fundamental group for locales.
Madeleine Birchfield said:
More generally, locales are a generalisation of topological spaces. A locale is defined as a set X with an arbitrary frame O(X), while a topological space is defined as a set X with a frame O(X) of subsets of X.
To define a locale we don't need a set lying around, we just have a frame.
(this is confusing because it means that a locale is "the same as" a frame, but the point is that a locale is an object of the opposite category, so the data of an object is the same, but it's useful to think of them as distinct from objects of the original category)
Madeleine Birchfield said:
Next question, though I would wait for Federica Pasqualone:
What is a filter on a lattice, and what does it mean for a filter to be completely prime?
Sorry, I was dreaming of locales with enough points at that time.... hahhahaha
Anyways, filters, ultrafilters and other amazing gadgets! :star_struck:
A filter F is just an inhabited subset of a lattice L (but actually you just need the notion of ordered set) that has to respect two more axioms: it is upward closed (if x ∈ F and x ≤ y, then y ∈ F) and it is closed under binary meets (if x, y ∈ F, then x ∧ y ∈ F).
Now there are additional notions of prime, maximal filters ... coming soon! :)
A prime filter of a lattice L is a filter F such that the bottom element of L is not in F and, if the join of two elements a and b of L is in F, then either a is in F or b is in F.
Now, given a complete lattice L, a completely prime filter is a prime filter F such that for all subsets S with canonical injection i : S ↪ L, if the join ⋁ i(S) is in F, then there exists an s in S such that i(s) is in F. The points of a locale are the completely prime filters of its frame.
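To see these definitions in action, here is another small brute-force sketch (toy Python, reusing the hypothetical five-open topology on {0,1,2} from the earlier snippet): it enumerates the completely prime filters of that frame, and they come out as exactly the neighbourhood filters of the three points, matching the slogan "points = completely prime filters".

```python
# Toy computation: completely prime filters of the frame of opens of the
# hypothetical finite space {0,1,2} with opens ∅, {1}, {2}, {1,2}, {0,1,2}.
from itertools import combinations

opens = [frozenset(s) for s in [(), (1,), (2,), (1, 2), (0, 1, 2)]]

def is_filter(F):
    if not F:
        return False
    upward = all(v in F for u in F for v in opens if u <= v)     # upward closed
    meets = all((u & v) in F for u in F for v in F)              # closed under binary meets
    return upward and meets

def is_completely_prime(F):
    if frozenset() in F:                                         # bottom must not be in F
        return False
    for r in range(len(opens) + 1):
        for S in combinations(opens, r):                         # if a join lands in F,
            if frozenset().union(*S) in F and not any(s in F for s in S):
                return False                                     # some member must already be in F
    return True

candidates = [set(c) for r in range(len(opens) + 1) for c in combinations(opens, r)]
for F in candidates:
    if is_filter(F) and is_completely_prime(F):
        print(sorted(map(sorted, F)))   # the neighbourhood filter of one of the points 0, 1, 2
```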
What does one call a prime filter F on a σ-complete lattice L where for all sequences x : ℕ → L, if the countable join ⋁ₙ xₙ is in F, then there exists a natural number n such that xₙ is in F?
Presumably these will correspond to the points of a σ-locale, but I haven't found a reference which defines the points of a σ-locale. The completely prime filters can't be defined in a σ-frame because σ-frames, unlike frames, don't have arbitrary subset-indexed joins.
It is actually better in my opinion to think of filters on frames as predicates F : O(X) → Ω satisfying certain properties, where Ω is the initial frame, i.e. the set of truth values,
i.e. F(⊤) = ⊤ and F(a ∧ b) = F(a) ∧ F(b).
Then a completely prime filter on a frame O(X) is precisely a frame homomorphism O(X) → Ω, and the points of a locale X are the continuous functions 1 → X, where 1 is the terminal locale whose frame is the initial frame Ω.
Madeleine Birchfield said:
What does one call a prime filter F on a σ-complete lattice L where for all sequences x : ℕ → L, if the countable join ⋁ₙ xₙ is in F, then there exists a natural number n such that xₙ is in F?
That's a σ-prime filter. I can only find that attested in a couple places, but it follows the usual logic for the use of a ‘σ’ prefix to generalize from finite unions to countable unions.
Alex Simpson defined in his article Measure, randomness and sublocales a continuous function between σ-locales X and Y as a σ-frame homomorphism O(Y) → O(X).
Madeleine Birchfield said:
Presumably these will correspond to the points of a σ-locale, but I haven't found a reference which defines the points of a σ-locale. The completely prime filters can't be defined in a σ-frame because σ-frames, unlike frames, don't have arbitrary subset-indexed joins.
Now, let 1 be the terminal σ-locale, whose corresponding σ-frame is thus the initial σ-frame. The points of a σ-locale X then should be the continuous functions 1 → X, or the σ-frame homomorphisms O(X) → O(1).
If this definition of a point in a σ-locale is correct, all that's left to do here is to show that σ-frame homomorphisms into the initial σ-frame are the σ-prime filters to confirm this statement, or show that they do not coincide to disprove this statement.
The hope is that given a definition of a point in a σ-locale, we can then show that the points of a σ-locale form a σ-topological space, and define the notions of sober σ-topological space and topological σ-locale.
And then we can work with the σ-locale of real numbers whose space of points are the Dedekind reals which are valued in the initial σ-frame.
Madeleine Birchfield said:
The problem with measure theory is that one also talks about measurable functions which are partial functions which are almost-everywhere-defined except for a set of measure zero. And I don't know how that fits into the point-free approach.
Note the concept of 'measure zero' shows up when dealing with measure spaces, not measurable spaces. My advisor Irving Segal came up with a point-free approach to measure spaces, which he called 'integration algebras' - you can find it in his paper Algebraic integration theory or his book Integrals and Operators. The idea is to focus not on the measure space itself but on the algebra of almost everywhere bounded integrable functions on that space, mod the equivalence relation where two functions that are equal almost everywhere count as equivalent, and write down a couple of axioms obeyed by those. This gives the definition of 'integration algebra'.
Note that working dually with commutative algebras rather than spaces is parallel to choosing to work with commutative rings rather than affine schemes, or frames rather than locales. In a way it's just a question of attitude (whether to use a category or its opposite) but in some cases the definitions are more direct if you take the algebraic rather than geometric approach. And Segal was convinced that the real point of measure theory was to study integration, so he wanted to focus attention on the algebra of things to be integrated, without needing to first think of them as functions on some space.
John Baez said:
Note the concept of 'measure zero' shows up when dealing with measure spaces, not measurable spaces. My advisor Irving Segal came up with a point-free approach to measure spaces, which he called 'integration algebras' - you can find it in his paper Algebraic integration theory or his book Integrals and Operators. The idea is to focus not on the measure space itself but on the algebra of almost everywhere bounded integrable functions on that space, mod the equivalence relation where two functions that are equal almost everywhere count as equivalent, and write down a couple of axioms obeyed by those. This gives the definition of 'integration algebra'.
There's an approach where you equip your measurable space with a σ-ideal of null sets (or a δ-filter of full sets), sometimes called an enhanced measurable space, and then require any measures to be absolutely continuous with respect to this. (A measurable space becomes an enhanced measurable space in a trivial way by requiring only the empty set to be null.) This is enough structure to define the measurable algebra (which is the Boolean σ-algebra of measurable sets modulo null sets) and when the measurable space is localizable (which is when this measurable algebra is complete).
If you restrict attention to the localizable measurable spaces, then the corresponding algebra of almost-everywhere bounded measurable functions (taking values in the measurable space of real numbers and Borel sets) up to almost-everywhere equality is a commutative von Neumann algebra, and every commutative von Neumann algebra arises in this way. In fact, we get an opposite equivalence of categories if the morphisms between enhanced measurable spaces are measurable functions up to almost-everywhere equality (eta: such that the preimage of a null set is contained in a null set).
So presumably Segal's integration algebras are a generalization of commutative von Neumann algebras, although I guess that I should look at his paper to see.
Hi @Toby Bartels! Segal didn't require any 'completeness' of his integration algebras, but he proves each one is dense in some L^∞ space, which of course is a commutative von Neumann algebra. I don't think he pursued the subject in the way someone with category theory training would do. Thanks for telling me about these other newer ideas.
What's a 'localizable' measure space?
John Baez said:
Note that working dually with commutative algebras rather than spaces is parallel to choosing to work with commutative rings rather than affine schemes, or frames rather than locales.
Since you brought up affine schemes, let me ask a question. Are the "points" of an affine scheme Spec(R) the morphisms from Spec(ℤ) to Spec(R)?
I would prefer to define R-points of a scheme, affine or otherwise, to be morphisms from Spec(R) to that scheme. I suppose people may just call them 'points' when R = ℤ. But there's a lot you ignore if you work only with those points.
E.g. a scheme with just one such point can have an interesting fundamental group.
John Baez said:
I would prefer to define R-points of a scheme, affine or otherwise, to be morphisms from Spec(R) to that scheme. I suppose people may just call them 'points' when R = ℤ. But there's a lot you ignore if you work only with those points.
Ah, I was thinking about the case of commutative rings, but if one were working with general commutative k-algebras for a commutative ring k, it would make sense to define the points of an affine scheme Spec(R) as the morphisms from Spec(k) to Spec(R).
John Baez said:
Madeleine Birchfield said:
The problem with measure theory is that one also talks about measurable functions which are partial functions which are almost-everywhere-defined except for a set of measure zero. And I don't know how that fits into the point-free approach.
Note the concept of 'measure zero' shows up when dealing with measure spaces, not measurable spaces. My advisor Irving Segal came up with a point-free approach to measure spaces, which he called 'integration algebras' - you can find it in his paper Algebraic integration theory or his book Integrals and Operators. The idea is to focus not on the measure space itself but on the algebra of almost everywhere bounded integrable functions on that space, mod the equivalence relation where two functions that are equal almost everywhere count as equivalent, and write down a couple of axioms obeyed by those. This gives the definition of 'integration algebra'.
Thanks a lot for these references! :sunglasses:
John Baez said:
Note that working dually with commutative algebras rather than spaces is parallel to choosing to work with commutative rings rather than affine schemes, or frames rather than locales. In a way it's just a question of attitude (whether to use a category or its opposite) but in some cases the definitions are more direct if you take the algebraic rather than geometric approach. And Segal was convinced that the real point of measure theory was to study integration, so he wanted to focus attention on the algebra of things to be integrated, without needing to first think of them as functions on some space.
This business is getting very me ... :heart_eyes:
Ah, I was thinking about the case of commutative rings, but if one were working with general commutative k-algebras for a commutative ring k, it would make sense to define the points of an affine scheme Spec(R) as the morphisms from Spec(k) to Spec(R).
No, I was talking about commutative rings, or schemes over Z. Here you still need the flexibility of working with R-points for all commutative rings R! One flavor of point is not enough to fully probe a scheme.
John Baez said:
No, I was talking about commutative rings, or schemes over Z. Here you still need the flexibility of working with R-points for all commutative rings R! One flavor of point is not enough to fully probe a scheme.
Is this where the notion of [[generalised element]] in a category comes from? Since the morphisms from the terminal object aren't enough, such as here in the category of affine schemes.
This is a key motivator of generalized elements, yeah. For illustration, rather famously the scheme given by the equation xⁿ + yⁿ = zⁿ has no ℤ-points! (It's possible I'm confused about how points valued in a ring, rather than a field, work; if you switch to ℚ I'm more certain this is a way of stating Fermat's last theorem.)
That is, definitely a point of that scheme with coordinates in ℤ would give rise to a map from Spec(ℤ), but the converse is not quite so clear to me, because of weirdnesses with prime ideals vs maximal ideals.
John Baez said:
What's a 'localizable' measure space?
It's a localizable measurable space, and I defined it in the parenthesis; it's one where the Boolean algebra of measurable sets modulo null sets is complete.
There is a concept of localizable measure space, and in fact it came first, but it's more complicated. It requires that in addition to the above boolean algebra (using the null sets from the measure now) being complete, the measure also has to be semifinite (which means that every measurable set with positive measure contains a measurable set with finite positive measure). These are the ones that satisfy nice theorems from analysis such as the Radon–Nikodym Theorem (which people usually only state for σ-finite measure spaces).
There are some additional properties that are useful to require that I don't understand very well, which don't give you fewer objects but ensure that you get the correct morphisms. See the work by Dmitri Pavlov summarized at [[categories of measure theory]].
John Baez said:
I would prefer to define R-points of a scheme, affine or otherwise, to be morphisms from Spec(R) to that scheme. I suppose people may just call them 'points' when R = . But there's a lot you ignore if you work only with those points.
Since a point has no part (meaning no nontrivial part but both trivial parts), I would think of a morphism from Spec(ℤ) as a point and a morphism from some other scheme as an element of a different shape. This is how the terminology is used in locale theory, but perhaps it is different in algebraic geometry.
Next in locale theory: product locales. Once we have product locales, we can define things like localic groups and localic rings.
The product of two locales X and Y should be the locale X × Y whose frame O(X × Y) is the coproduct of the frames O(X) and O(Y).
But what is the coproduct of two frames @Federica Pasqualone?
Toby Bartels said:
John Baez said:
I would prefer to define R-points of a scheme, affine or otherwise, to be morphisms from Spec(R) to that scheme.
Since a point has no part (meaning no nontrivial part but both trivial parts), I would think of a morphism from Spec(ℤ) as a point and a morphism from some other scheme as an element of a different shape. This is how the terminology is used in locale theory, but perhaps it is different in algebraic geometry.
It is. They say a scheme assigns to each commutative ring R a set of 'R-points', and this defines a (covariant) functor from CommRing to Set called the 'functor of points'.
In the older approach to schemes a scheme is defined as a topological space with a sheaf of commutative rings on it such that some conditions hold, but there's a slick modern approach where you start with any functor from CommRing to Set, and say it's a scheme if some condition holds!
The condition is nice, too: a functor from CommRing to Set is the same as a presheaf on the opposite of CommRing, and we say it's a 'scheme' if this presheaf is a sheaf!
Of course we need to pick a Grothendieck topology to say which presheaves are sheaves, and we pick one called the Zariski topology.
As Kevin noted, it may be really hard to study Z-points of a scheme, because this amounts to solving Diophantine equations. It's easier to study R-points when R is a field.
Kevin Carlson said:
This is a key motivator of generalized elements, yeah. For illustration, rather famously the scheme given by the equation xⁿ + yⁿ = zⁿ has no ℤ-points!
Well, it has some 'trivial' ones coming from solutions where one of the variables is zero, but apart from those it has none when n > 2.
(It's possible I'm confused about how points valued in a ring, rather than a field, work; if you switch to ℚ I'm more certain this is a way of stating Fermat's last theorem.)
Unless I'm confused, you're not confused: R-points of the affine scheme defined by some polynomial equations are the same as solutions of those equations in R.
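A throwaway illustration of that last slogan (toy Python, with a made-up equation of my own rather than anything from the thread): the (ℤ/n)-points of the affine scheme cut out by x² + y² = 1 are literally the solutions of that equation in ℤ/n.

```python
# R-points of the scheme {x^2 + y^2 = 1} for R = Z/n: just the solutions in R.
def points(n):
    return [(x, y) for x in range(n) for y in range(n) if (x * x + y * y - 1) % n == 0]

for n in [2, 3, 5, 7]:
    print(f"Z/{n}-points of x^2 + y^2 = 1:", points(n))
```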
Madeleine Birchfield said:
Next in locale theory: product locales. Once we have product locales, we can define things like localic groups and localic rings.
The product of two locales X and Y should be the locale X × Y whose frame O(X × Y) is the coproduct of the frames O(X) and O(Y).
But what is the coproduct of two frames Federica Pasqualone?
Coproducts of frames are a little tricky! One first constructs a special Cartesian product, then has a look at coproducts in semilattices, then considers the downsets of this special Cartesian product modulo a relation (one "saturates"), and then the coproduct is this saturated collection, with the coprojections given as frame homomorphisms (each of them a composite of three arrows) mapping into it.
John Baez said:
The condition is nice, too: a functor from CommRing to Set is the same as a presheaf on the opposite of CommRing, and we say it's a 'scheme' if this presheaf is a sheaf!
I don't think that's quite right: a sheaf on the opposite of CommRing is only a scheme if it also has a small open Zariski cover (nlab).
John Baez said:
In the older approach to schemes a scheme is defined as a topological space with a sheaf of commutative rings on it such that some conditions hold, but there's a slick modern approach where you start with any functor from CommRing to Set, and say it's a scheme if some condition holds!
This means there are two different notions of "point of a scheme". On the newer hand, if you view a scheme X as a certain functor CommRing → Set, then as John said an R-point of X is an element of X(R), or equivalently (by the Yoneda lemma) a natural transformation Hom(R, −) → X. But on the older hand, if you view a scheme X as a topological space with a certain kind of sheaf of rings on it, then a "point of X" would naturally mean a point of that topological space.
In the case when X is affine, i.e. as a functor it is a representable Hom(A, −) for some commutative ring A, then "new-style" R-points are ring homomorphisms A → R, while "old-style" points are prime ideals in A. Since the ideals in a ring are bijective to its quotients (up to isomorphism), with prime ideals corresponding to quotients that are integral domains, in this case the old-style points can be identified with certain of the new-style points, namely the ring homomorphisms out of A that are surjective onto an integral domain, up to isomorphism. I don't remember whether/how this generalizes to non-affine schemes.
I believe it works in a very similar way since the study of points (of either kind) can be done locally and every scheme has an open cover by affine schemes. (By a set of affine schemes. :wink:)
Mike Shulman said:
In the case when X is affine, i.e. as a functor it is a representable Hom(A, −) for some commutative ring A, then "new-style" R-points are ring homomorphisms A → R, while "old-style" points are prime ideals in A. Since the ideals in a ring are bijective to its quotients (up to isomorphism), with prime ideals corresponding to quotients that are integral domains, in this case the old-style points can be identified with certain of the new-style points, namely the ring homomorphisms out of A that are surjective onto an integral domain, up to isomorphism.
This is all very nice. In week205 I outlined a somewhat different take on the same ideas, which I learned from James Dolan. Quoting some snippets:
What's special about the ring of functions on a space consisting of just one point? Take real- or complex-valued functions, for example. How do these differ from the functions on a space with lots of points?
The answer is pretty simple: on a space with just one point, a function that vanishes anywhere vanishes everywhere! So, the only function that fails to have a multiplicative inverse is 0. For bigger spaces, this isn't true.
A commutative ring where only 0 fails to have a multiplicative inverse is called a "field". So, the algebraic analogue of a one-point space is a field.
This means that the algebraic analogue of a map from a one-point space into some other space:
i: {x} → X
should be a homomorphism from a commutative ring R to a field k:
f: R → k
In the geometry we learned in high school, once we see one point, we've seen 'em all: all one-point spaces are isomorphic. But not all fields are isomorphic! So, if we're trying to think of algebra as geometry, it's a funny sort of geometry where points come in different flavors!
Moreover, there are homomorphisms between different fields. These act like "flavor changing" maps - maps from a point of one flavor to a point of some other flavor.
If we have a homomorphism f: R → k and a homomorphism from k to some other field k', we can compose them to get a homomorphism f ': R → k'. So, we're doing some funny sort of geometry where if we have a point mapped into our space, we can convert it into a point of some other flavor, using a "flavor changing" map.
Now let's take this strange sort of geometry really seriously, and figure out how to actually turn a commutative ring into a space! First I'll describe what people usually do. Eventually I'll describe what perhaps they really should do - but maybe you can guess before I even tell you.
People usually cook up a space called the "spectrum" of the commutative ring R, or Spec(R) for short. What are the points of Spec(R)? They're not just all possible homomorphisms from R to all possible fields. Instead, we count two such homomorphisms as the same point of Spec(R) if they're related by a "flavor changing process". In other words, f ': R → k' gives the same point as f: R → k if you can get f ' by composing f with a homomorphism from k to k'.
This is a bit ungainly, but luckily there's a quick and easy way to tell when f: R → k and f ': R → k' are related by such a flavor changing process, or a sequence of such processes. You just see if they have the same kernel! The "kernel" of f: R → k is the subset of R consisting of elements r with
f(r) = 0
The kernel of a homomorphism to a field is a "prime ideal", and two homomorphisms are related by a sequence of flavor changing processes iff they have the same kernel. Furthermore, every prime ideal is the kernel of a homomorphism to some field. So, we can save time by defining Spec(R) to be the set of prime ideals in R.
Then later:
I've been avoiding saying the things people usually say. People usually note that a maximal ideal is the same as the kernel of a homomorphism ONTO a field, while a prime ideal is the same as the kernel of a homomorphism ONTO an integral domain. (Recall that an integral domain is a commutative ring where xy = 0 implies that x or y is zero.) If we define the "points" of a commutative ring R to be its maximal or prime ideals, we can therefore think of these as the kernels of homomorphisms from R onto fields or integral domains.
However, defining points in terms of homomorphisms ONTO a given sort of commutative ring is rather irksome, because it doesn't tell us how points transform under homomorphisms of commutative rings, nor under the "flavor-changing operations" I was describing. The problem is that the composite of an onto homomorphism and another homomorphism needn't be onto!
So, what really matters is that a prime ideal is the same as the kernel of a homomorphism TO a field. To see how this follows from the usual story, note that any integral domain is contained in a field called its "field of fractions" - just as Z is contained in Q. Any homomorphism ONTO the integral domain thus becomes a homomorphism TO this field, with the same kernel. Conversely, any homomorphism TO a field becomes a homomorphism ONTO its image, with the same kernel - and this image is always an integral domain.
This story also sheds some light on why fields are useful in the study of schemes even when we don't stick fields in "by hand", e.g. by working with schemes over a specific field.
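To connect the "kernels of homomorphisms to fields" picture to something computable, here is a small, purely illustrative brute-force sketch (my own toy Python, tiny numbers only): it enumerates the ring homomorphisms from ℤ/6 to the prime fields F_2, F_3, F_5 and prints their kernels. The kernels that appear are exactly the prime ideals (2) and (3) of ℤ/6, and there is no homomorphism to F_5 at all.

```python
# Brute-force the ring homomorphisms Z/6 -> F_p and look at their kernels.
def ring_homs(n, p):
    homs = []
    for seed in range(p):                          # a candidate hom is determined by f(1)
        f = [(seed * k) % p for k in range(n)]
        unital = f[1] == 1
        additive = all(f[(a + b) % n] == (f[a] + f[b]) % p for a in range(n) for b in range(n))
        multiplicative = all(f[(a * b) % n] == (f[a] * f[b]) % p for a in range(n) for b in range(n))
        if unital and additive and multiplicative:
            homs.append(f)
    return homs

for p in [2, 3, 5]:
    for f in ring_homs(6, p):
        kernel = [k for k in range(6) if f[k] == 0]
        print(f"Z/6 -> F_{p} with kernel {kernel}")   # kernels: (2) and (3), the prime ideals
```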
How does this process work in constructive mathematics, where there are multiple notions of fields - discrete fields, Heyting fields, etc?
That question is above my pay grade, but there may be someone here who knows.
While also not an expert, my expectation is that you can set up the story the same way but some details will change. After all, the basic set-up of algebraic geometry (treating the opposite of a category of algebraic things as a category of geometric things which can be glued together as geometers do) requires relatively few properties to work afaict, but most other versions of it are under-developed because we don't have the same motivation (from number theory, say) to study them!
In constructive mathematics I wouldn't expect integral domains, fields, prime ideals, and maximal ideals to play as central a role in algebraic geometry. For example, constructively we can't say that any ideal is contained in a maximal one or even a prime one. I expect that the correct constructive definition of the old-style space is a locale (I think its frame of opens is something like the "frame reflection" of the quantale of all ideals in A) which may not be spatial, corresponding to the fact that A may not have "enough" prime ideals. But I expect the functor of points approach would work more directly.
The other problem in constructive algebraic geometry already comes in the definition of a commutative ring. Does one work in the category of sets and functions and deal with ideals, subalgebras, etc, or does one work in the subcategory of sets equipped with tight apartness relations and apartness-reflecting functions and deal with anti-ideals, anti-subalgebras, etc? I've seen some constructive algebraists take one approach and others take the other approach.
In classical mathematics this isn't a problem because the negation of equality is a tight apartness relation.
If I wanted to work constructively I'd probably try to avoid relying on tight apartness relations, because they feel so classical - it's sort of like deciding you want to take a swim and then putting on a diving suit so you don't get wet.
Are there any commonly used rings in mathematics which do not have a tight apartness relation in constructive mathematics?
I have no idea. I don't really think about constructive mathematics much, since I'm already having too much trouble with ordinary mathematics. But it's an interesting question.
Do all of the constructive variants of the reals have a tight apartness relation?
I think @Andrej Bauer could answer that. Now that I imagine him answering it, my guess is that the answer is "no". (Thinking about a question by imagining how someone else would answer it.)
Morgan Rogers (he/him) said:
Do all of the constructive variants of the reals have a tight apartness relations?
I think the MacNeille reals do not, but they behave very poorly in constructive mathematics for various reasons, and so aren't commonly used in constructive mathematics.
The MacNeille reals are not a local ring, but they do form a [[weak Heyting field]]. So that is one example of a ring without a tight apartness relation.
Once you start doing category-theoretic things like taking quotients, apartness relations tend to disappear. E.g. in order for R/I to have a tight apartness, I think you need I to be the complement of an anti-ideal. Concretely, for instance, I think only has a tight apartness if . So if you want a good category of rings, as you probably do in algebraic geometry, you probably shouldn't require that they have tight apartnesses. Probably rings with tight apartness should be considered a class of "nice objects" sitting properly inside the "nice category" of all rings.
A more general question: is the category of sets with tight apartness relations and apartness-reflecting functions a pretopos, or is it only a coherent category?
Is it even a regular category?
hm, I'm not actually sure.
This relational enterprise is getting very interesting! :smirk:
I had no idea of the existence of a particular name for the complement of an equivalence relation! Apartness! #
It is even better than how my professor of general topology, Prof. Eraldo Giuli, introduced us to the notion of totally (path) disconnected spaces by stories of impossible loves. We all still remember the definition!
I have a lot to learn about the constructivists! :books:
If two points are not tightly apart, they are equal! Brilliant! :light_bulb:
About tight apartness relations, I want to address something I wrote above.
Madeleine Birchfield said:
Personally for me, I'm currently looking at weak versions of excluded middle, such as
- "given two sets and with decidable tight apartness relations, the tight apartness relation on the function set is decidable"
- "if a set has a tight apartness relation, that relation is decidable"
which would make the real numbers and the function sets and behave more as they do in classical mathematics.
If sets with decidable tight apartness relations are cartesian closed, then excluded middle holds.
The idea is to consider subsingletons: sets P for which all elements x and y satisfy x = y. Subsingletons have a decidable tight apartness relation which is always false, given by the negation of equality.
The set with two elements has decidable tight apartness, also given by the negation of equality.
Now, for a subsingleton P, consider the function set P → 2. There are two functions from P to 2, which may or may not be equal to each other: the constant function at 0 and the constant function at 1. Now, the tight apartness relation on P → 2 is defined as f # g iff there exists an x in P with f(x) ≠ g(x).
The negation of this existential quantifier is the universal quantifier "for all x in P, f(x) = g(x)", which is the same as equality by function extensionality.
To say that the tight apartness relation on P → 2 is decidable is to say that for the constant functions at 0 and at 1, either they are apart or they are equal. In the first case, P is inhabited and thus a singleton, and the second case is only possible if P is an empty set. But this implies that P is either empty or a singleton, and if this is the case for all subsingletons P, then (internal) excluded middle holds, and classical mathematics follows.
Madeleine Birchfield said:
To get the true effect of classical mathematics for and I'd probably have to pair
- "given two sets and with decidable tight apartness relations, the tight apartness relation on the function set is decidable"
with something like
- "given two spatial locales and , the locale associated with the function set is spatial, where and are the spaces of points of the locales and "
Thus, there is no need to add this second axiom about locales being spatial, since that is already implied by the first axiom via excluded middle.
The LEM is back in the game! :danger:
Well, I cannot help tonight. I am busy with quasicoherent sheaves ... :heart_eyes:
Mike Shulman said:
Once you start doing category-theoretic things like taking quotients, apartness relations tend to disappear. E.g. in order for R/I to have a tight apartness, I think you need I to be the complement of an anti-ideal. Concretely, for instance, I think only has a tight apartness if . So if you want a good category of rings, as you probably do in algebraic geometry, you probably shouldn't require that they have tight apartnesses. Probably rings with tight apartness should be considered a class of "nice objects" sitting properly inside the "nice category" of all rings.
Note that a tight apartness is a structure, not a property, so not only does R/I have a tight apartness only when I is the negation of an anti-ideal, but even then, you have to pick an anti-ideal to know which apartness to use. (Similarly, rings with tight apartness don't sit nicely inside all rings as a subcategory.)
So if you're working with rings with tight apartness relations, then you want to just always work with anti-ideals. You don't even write R/I for an ideal I; you write R/A for an anti-ideal A. It's true that the complement of an anti-ideal is always an ideal, but you don't really care about those. And while it's harder to get an anti-ideal than an ideal, there are usually plenty.
There are typically more anti-prime anti-ideals and anti-maximal anti-ideals than prime ideals or maximal ideals; an anti-ideal A is anti-prime iff it's closed under (nullary and binary) multiplication, and anti-maximal iff it has exactly one inhabited sub-anti-ideal (necessarily itself). For example, in the ring of real numbers, we can't prove that the ideal {0} is prime, much less maximal; but in any Heyting field, we can prove that the set of invertible elements (the elements apart from 0) form an anti-prime and anti-maximal anti-ideal.
I agree that the rings with tight apartness are the nice objects in a larger category, but this larger category should be the category of rings with inequality (where an inequality is an irreflexive symmetric relation, and for the compatibility of inequality and multiplication, we require only that a ≠ b when ac ≠ bc, and similarly for addition). Rings are also a subcategory of these (since we can always use the denial inequality), but a mostly separate one. Of course, this is just the category of rings under the antithesis interpretation (with multiplicative connectives).
Ok @Toby Bartels, that's fair, I think I mostly agree with you. It would be interesting to see how much algebraic geometry you can do in the category of rings with inequality; the inequality adds a bit of a "topological" flavor to the algebra.
However, once we've decided to use rings with inequality, perhaps we should go all in on the antithesis interpretation and work with ideal/anti-ideal pairs, without assuming that either is the complement of the other, and weaken the axioms to be multiplicative. E.g. in a ring with inequality that isn't a tight apartness, the "zero anti-ideal" won't satisfy the usual -openness condition in the definition of an anti-ideal, nor will its complement be ; but the pair satisfies the "multiplicative openness" condition that .
Mike Shulman said:
Ok Toby Bartels, that's fair, I think I mostly agree with you. It would be interesting to see how much algebraic geometry you can do in the category of rings with inequality; the inequality adds a bit of a "topological" flavor to the algebra.
However, once we've decided to use rings with inequality, perhaps we should go all in on the antithesis interpretation and work with ideal/anti-ideal pairs, without assuming that either is the complement of the other, and weaken the axioms to be multiplicative. E.g. in a ring with inequality that isn't a tight apartness, the "zero anti-ideal" won't satisfy the usual -openness condition in the definition of an anti-ideal, nor will its complement be ; but the pair satisfies the "multiplicative openness" condition that .
Yes, I agree! And the nice thing about this is that it lets us have both weak and strong versions of definitions at every stage. So an antithesis ideal is a pair satisfying rules like the multiplicative openness that you mentioned, while a strong ideal satisfies the additive openness rule of traditional anti-ideals. But then in either case, we can consider whether it's merely prime, satisfying etc, or strongly prime, satisfying (which is not required of traditional anti-prime anti-ideals) etc.
For those wondering what the antithesis interpretation is, Mike Shulman wrote a paper about the topic:
https://arxiv.org/abs/1805.07518
Is there an antithesis interpretation of constructive measure theory?
Mike Shulman said:
Ok Toby Bartels, that's fair, I think I mostly agree with you. It would be interesting to see how much algebraic geometry you can do in the category of rings with inequality; the inequality adds a bit of a "topological" flavor to the algebra.
For what it's worth, I did my PhD thesis on the actual 'topological situation' of finding a spectrum for localic rings. See also this paper. (Though there is still a lot of work to do. Most importantly, finding conditions under which the structure bundle exists.) I have wondered before about whether the case of apartness rings fits into this setting, but it requires a deeper understanding of the topology on apartness spaces than I have at present.
Also, to answer some questions above, the points of the Zariski spectrum of a ring are the prime anti-ideals and the frame is indeed the localic reflection of the quantale of ideals. You can think of the fields you take values in as being Heyting fields, but it is better to say you are taking values in local rings equipped with the apartness relation: x # y iff x − y is invertible.
Madeleine Birchfield said:
Is there an antithesis interpretation of constructive measure theory?
Almost surely. (-:O
Madeleine Birchfield said:
Is there an antithesis interpretation of constructive measure theory?
The difficulty with answering that question is that it presupposes a particular way of writing down the basic definitions of the subject, and we've seen half a dozen ways to do that already in this thread. As an interpretation, it's an interpretation of linear (or affine) logic within constructive mathematics, so we need to have particular statements in the language of linear logic to interpret.
So for me, it's useful more as a point of philosophy, that whenever we have a statement, we should try to identify its antithesis and keep track of both. That's an idea that long predates Mike's paper, really, but Mike formalized it and found the connection to linear logic. And by checking against his formalism, I can see if I forgot something, or if I can get a new idea by switching between additive and multiplicative connectives.
Yes, you are right @Toby Bartels we wrote down more or less all the possible known alternative definitions. :sunglasses:
There is material for a survey paper! :big_smile:
Back to locales. We've defined what the terminal locale is and what product locales are. Since we have products, we have diagonals Δ : X → X × X, which are codiagonals in the category of frames Frm.
A locale X is discrete if the diagonal Δ : X → X × X is an open map and the unique map X → 1 into the terminal locale is an open map.
In classical mathematics a topological space is discrete iff the diagonal is an open map. I'm unfamiliar with the extra condition that the map into the terminal object be open, since that's automatic for topological spaces (at least classically: I'm a classical guy). But it's a natural partner to the condition on the diagonal so it's very interesting: there seem to be two kinds of discreteness. What kind of locales are ruled out by the condition that the map to the terminal locale should be open?
The locales in which the unique map into the terminal locale is open are called "overt locales" or "locally positive locales". In classical mathematics, every locale is overt.
We are back guys ... :joy: I was missing you all!
Thanks, Madeleine. The term "overt" is helpful, but I'm trying to get an intuition for what non-overt locales are like. How should I think about them?
I'm not sure, since I don't really have the intuition for non-overt locales either, but I think those who are more well-versed in constructive locale theory like @Graham Manuell might be able to help you here.
Another thing about discrete topological spaces and discrete locales:
Discrete topological spaces are those topological spaces whose frame of opens is the power set P(X), which means that the power set is itself a locale. This also could be shown directly, since Ω is the initial frame, and any function set whose codomain is a frame is also a frame, and thus power sets P(X) = (X → Ω) can be interpreted as locales.
Also I think the completely prime filters of P(X) are the singleton subsets of X, and that P(X) is a discrete locale, but I'm not entirely sure of how to prove the latter statement.
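For what it's worth, the first claim can at least be machine-checked in tiny cases. Here is one more brute-force sketch (toy Python, same idea as the filter computation earlier, and of course not a proof): for the made-up set X = {0,1}, the completely prime filters of the powerset frame P(X) are exactly the principal filters at the singletons, i.e. the points of the discrete locale are the elements of X.

```python
# Toy check: completely prime filters of P({0,1}) correspond to the elements 0 and 1.
from itertools import combinations

X = {0, 1}
opens = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]   # all of P(X)

def is_cp_filter(F):
    if not F or frozenset() in F:
        return False
    if not all(v in F for u in F for v in opens if u <= v):      # upward closed
        return False
    if not all((u & v) in F for u in F for v in F):              # closed under meets
        return False
    for r in range(len(opens) + 1):                              # completely prime
        for S in combinations(opens, r):
            if frozenset().union(*S) in F and not any(s in F for s in S):
                return False
    return True

candidates = [set(c) for r in range(len(opens) + 1) for c in combinations(opens, r)]
for F in filter(is_cp_filter, candidates):
    print(sorted(map(sorted, F)))   # the two filters "contains 0" and "contains 1"
```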
The nLab has a page about [[overt spaces]] for whatever that's worth. The intuition I find most helpful is that an overt locale is one that's "locally inhabited", which is a rung below the bottom of the usual ladder from "locally connected" to "locally simply connected" and eventually "locally contractible" at the top. Just as a locally connected space is one that can be covered by connected open subspaces, an overt space is one that can be covered by inhabited open subspaces (although in locale theory the notion of "inhabited" is somewhat subtle, since an "inhabited" space might still not have any points, and so we call it "positive" instead).
(To forestall any confusion, note that although the empty space is neither connected nor inhabited, it is both locally connected and locally inhabited, i.e. overt, since it can be covered by a collection of zero connected/inhabited open subspaces.)
The set of true propositions is also a frame, the terminal frame and thus the initial locale, and the indiscrete topological spaces are the ones whose frame of opens is the function set , which is frame-homomorphic to , and thus is not sober, for non-singleton .
Madeleine Birchfield said:
Also I think the completely prime filters of P(X) are the singleton subsets of X, and that P(X) is a discrete locale, but I'm not entirely sure of how to prove the latter statement.
If this is true, then I think that the functor pt : Loc → Set which takes a locale to its set of points has a left adjoint given by the powerset functor P : Set → Loc.
Then, there are technically multiple locales of real numbers. There is the usual [[locale of real numbers]] defined by some localic completion of the rational numbers. Then there is the discrete locale of real numbers, which is given by the powerset frame P(ℝ) of the Dedekind real numbers ℝ.
@John Baez Let X be a set and let S be a subset of X that is not decidable. S is an open sublocale of X and its complementary closed sublocale has open diagonal, but is not overt. If it were overt it would be a set, but S doesn't have a complementary subset by assumption.
Overt locales are the locales you can existentially quantify over. They are the locales X for which, if U is an open of X × Y, then its image under the projection to Y is open. In other words, the product projection X × Y → Y is open when X is overt.
I would not say the underlying set of ℝ is a 'locale of reals'.
John Baez said:
In classical mathematics a topological space is discrete iff the diagonal is an open map. I'm unfamiliar with the extra condition that the map into the terminal object be open, since that's automatic for topological spaces (at least classically: I'm a classical guy). But it's a natural partner to the condition on the diagonal so it's very interesting: there seem to be two kinds of discreteness. What kind of locales are ruled out by the condition that the map to the terminal locale should be open?
A space with the property that the unique map to the terminal space is an open map, is an overt space (originally called open, and also called locally positive). For topological spaces in the standard sense taught to us by Bourbaki, they're all overt, rather trivially (and constructively). For locales, it's less trivial, but you can still prove it classically. But constructively you can't; so only in constructive locale theory (and other more outré alternatives) is this even an issue.
Paul Taylor constructs an example of a non-overt space (more precisely, one that cannot constructively be proved overt) making use of a real number whose equality with 0 can't be decided. (Such numbers are in turn routinely constructed as infinite series using unproven conjectures in number theory, or using the halting problem.) If x is such a number, then {0} and {x} are discrete subspaces of the real line, but their intersection (which classically is either {0} or the empty space, both again discrete) cannot be proved overt.
Taylor is the one who popularized ‘overt’ rather than ‘open’ (since he wanted to talk about when a subspace of an overt space is overt, but the term ‘open subspace’ means something completely different). Taylor also uses ‘discrete’ to mean only that the diagonal map is open, which is non-standard, but I don't know any other term for just this property. Compare that a space is hausdorff iff the diagonal map is proper and that a space is compact iff the terminal map is proper; like discreteness (in Taylor's sense) and overtness, hausdorffness and compactness often come together but can also be considered separately.
Recall that a space X is compact iff, given any space Y and open subspace U of the product X × Y, for some (necessarily unique) open subspace V of Y, for each (open) subspace W of Y, X × W is contained in U iff W is contained in V. In other words, if you think of an open subspace as giving us a verifiable property, then X is compact if and only if we can always get another verifiable property by universally quantifying over X.
Dually, X is overt iff, given any space Y and open subspace U of the product X × Y, for some (necessarily unique) open subspace V of Y, for each (open) subspace W of Y, U is contained in X × W iff V is contained in W. In other words, X is overt if and only if we can always get another verifiable property by existentially quantifying over X. (Graham Manuell just said this.)
ETA: The compact spaces of Bishop-style constructive analysis generally become compact overt spaces of constructive localic analysis. Much of what Bishop does with locatedness is, from the localic perspective, really about overtness.
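Here is one last toy illustration of those two quantifiers (my own Python sketch, reusing the hypothetical finite space from the earlier snippets; every finite space is compact and overt, so both adjoints exist and the two "iff"s can be brute-forced over every open of the product):

```python
# Toy check of universal and existential quantification along a finite space.
# X = {0,1,2} with opens ∅, {1}, {2}, {1,2}, X, and Y = Sierpinski space.
from itertools import combinations, product

X = {0, 1, 2}
OX = [frozenset(s) for s in [(), (1,), (2,), (1, 2), (0, 1, 2)]]
Y = {0, 1}
OY = [frozenset(s) for s in [(), (1,), (0, 1)]]

rects = [frozenset(product(u, v)) for u in OX for v in OY]          # open rectangles
OXY = list({frozenset().union(*S)                                   # opens of X × Y = unions of rectangles
            for r in range(len(rects) + 1) for S in combinations(rects, r)})

def forall_X(U):   # the largest open V of Y with X × V contained in U
    return frozenset().union(*[W for W in OY if frozenset(product(X, W)) <= U])

def exists_X(U):   # the image of U under the projection to Y
    return frozenset(y for (x, y) in U)

for U in OXY:
    assert exists_X(U) in OY                                         # overtness: the projection is open
    for W in OY:
        assert (frozenset(product(X, W)) <= U) == (W <= forall_X(U))   # compactness adjunction
        assert (U <= frozenset(product(X, W))) == (exists_X(U) <= W)   # overtness adjunction
```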
Oops. Fixed
Graham Manuell said:
I would not say the underlying set of ℝ is a 'locale of reals'.
Would you say that the topological space ℝ equipped with the discrete topology is not a 'space of reals', or that the flat modality of the cohesive ∞-groupoid of Dedekind reals with its Euclidean topology is not a 'cohesive ∞-groupoid of reals'?
What makes locales different from topological spaces or cohesive ∞-groupoids that warrants restricting the notion of a 'locale of real numbers' to that whose opens are generated by rational open intervals?
I would probably restrict the phrase "space of reals" or "locale of reals" to the unique one that has the correct topology. The set of reals with any other topology I would refer to as something like "a space whose points are the real numbers" rather than the ambiguous "a space of reals".
I would (grudgingly?) accept the space of reals, but only when equipped with its topology. I wouldn't call a space of reals.
Toby Bartels said:
Taylor also uses ‘discrete’ to mean only that the diagonal map is open, which is non-standard, but I don't know any other term for just this property. Compare that a space is hausdorff iff the diagonal map is proper and that a space is compact iff the terminal map is proper; like discreteness (in Taylor's sense) and overtness, hausdorffness and compactness often come together but can also be considered separately.
Some people (e.g. algebraic geometers) use "compact" to mean what the rest of us call "compact hausdorff", and "quasi-compact" to mean what the rest of us call "compact". So perhaps "quasi-discrete" would be a good name for spaces with open diagonal.
Mike Shulman said:
I would probably restrict the phrase "space of reals" or "locale of reals" to the unique one that has the correct topology. The set of reals with any other topology I would refer to as something like "a space whose points are the real numbers" rather than the ambiguous "a space of reals".
So then, in real-cohesive HoTT, what would you call ♭ℝ? "A cohesive type whose points are the real numbers"?
@John Baez We should probably have mentioned that there is a connection to the other discussion you are having about local homomorphisms. Those need to be open and have open diagonal in the pullback. They correspond to discrete locales in the sheaf topos over the base. Without overtness you lose the condition that the map must be open.
And what about localic topoi? :heart_eyes:
@Madeleine Birchfield Sober spaces are always behind the curtains - since you mentioned me earlier.
Madeleine Birchfield said:
So then, in real-cohesive HoTT, what would you call ?
Usually I just call it ♭ℝ.
Mike Shulman said:
Some people (e.g. algebraic geometers) use "compact" to mean what the rest of us call "compact hausdorff", and "quasi-compact" to mean what the rest of us call "compact". So perhaps "quasi-discrete" would be a good name for spaces with open diagonal.
That could work, although the ‘quasi’ is going in the wrong direction. For a perfect analogy, ‘quasi-discrete’ would mean ‹overt› instead.
Mike Shulman said:
I would probably restrict the phrase "space of reals" or "locale of reals" to the unique one that has the correct topology. The set of reals with any other topology I would refer to as something like "a space whose points are the real numbers" rather than the ambiguous "a space of reals".
Graham Manuell said:
I would (grudgingly?) accept the space of reals, but only when equipped with its topology. I wouldn't call a space of reals.
So would you say that the "space of lower reals" is the (lower) reals equipped with the lower semi-continuity topology, and the "space of upper reals" is the (upper) reals equipped with the upper semi-continuity topology, and thus even in classical point-set topology, one would have to make a distinction between the "space of lower reals", the "space of upper reals", and "the space of reals"?
Federica Pasqualone said:
And what about localic topoi? :heart_eyes:
Somebody more familiar with Grothendieck topos theory will probably have to answer that question, or you can do so yourself.
I really enjoyed all these comments, which give me some intuition for overtness. I especially enjoyed Toby reminding me of the characterization of compactness (resp. Hausdorffness) in terms of properness of the unique map to the terminal space (resp. diagonal map).
(Is the opposite of an overt space a covert space? :crazy:)
John Baez said in part:
(Is the opposite of an overt space a covert space? :crazy:)
We call a space covert if its terminal map (the unique map to the terminal space) is closed. This is a less important concept, and (like overtness) trivial for topological spaces; I forget if it's also classically trivial for locales.
For topological spaces, you can also define hausdorffness by requiring the diagonal to be closed, rather than proper. But for locales, even classically, this is a different notion, and again less important. (I don't know if it has a name.)
Madeleine Birchfield said:
Federica Pasqualone said:
And what about localic topoi? :heart_eyes:
Somebody more familiar with Grothendieck topos theory will probably have to answer that question, or you can do so yourself.
I thought it was a good topic to have the full picture of this enterprise .. a localic topos is equivalent to sheaves on a locale, I found it very interesting ...
Equivalently, one can say that a Grothendieck topos is localic iff there exists a site for it with a poset as underlying category. :upside_down:
Food for thoughts, guys ...
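A standard example, for concreteness: for any locale $L$ one can take the site to be the frame $\mathcal{O}(L)$ itself, viewed as a posetal category, with a family of opens covering $U$ when its join is $U$:
\[ \mathrm{Sh}(L) \;\simeq\; \mathrm{Sh}\big(\mathcal{O}(L), J\big), \qquad \{U_i \le U\}_{i \in I} \in J(U) \iff \bigvee_{i \in I} U_i = U. \]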
John Baez said:
I really enjoyed all these comments, which give me some intuition for overtness. I especially enjoyed Toby reminding me of the characterization of compactness (resp. Hausdorffness) in terms of properness of the unique map to the terminal space (resp. diagonal map).
(Is the opposite of an overt space a covert space? :crazy:)
A psychological characterization of spaces hahah :zany_face:
Federica Pasqualone said:
I thought it was a good topic to have the full picture of this enterprise .. a localic topos is equivalent to sheaves on a locale, I found it very interesting ...
Yeah, it's just that I don't know much about Grothendieck toposes at all, other than
Well, I am also still learning. After all, it is an elephant. Isn't it?
Here is how it works for me: I always keep the topological picture in mind, 'cause it helps my geometrical mind to imagine and remember the notions.
We take a small category together with a cover (satisfying what you expect a cover should satisfy). It is just topology! :heart_eyes: But now we call it a site, with its Grothendieck topology J, and sheaves on it give a topos. And look, the cover is a family of maps - it is not a bunch of open sets - and ta-dah! The theory gets so general and so powerful! A rabbit out of the hat! It's magic! :magic:
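To spell out "what you expect a cover should satisfy", here is the usual sieve formulation (a standard sketch, with $J(U)$ denoting the set of covering sieves on an object $U$ of a small category $\mathcal{C}$):
\[
\begin{aligned}
&\text{(maximality)} && \{\, f \mid \operatorname{cod} f = U \,\} \in J(U);\\
&\text{(stability)} && S \in J(U),\ g \colon V \to U \ \Longrightarrow\ g^*S = \{\, h \mid g \circ h \in S \,\} \in J(V);\\
&\text{(transitivity)} && S \in J(U),\ R \text{ a sieve on } U,\ \forall (f \colon V \to U) \in S \colon f^*R \in J(V) \ \Longrightarrow\ R \in J(U).
\end{aligned}
\]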
Given a locale $X$, what is the frame of opens of the locale $X^{\mathbb{N}}$?
Like, the locale $X^{\mathbb{N}}$ is the terminal coalgebra of the endofunctor $Y \mapsto X \times Y$, which means that the frame of opens is the initial algebra of the endofunctor $F \mapsto \mathcal{O}(X) \oplus F$ in the category of frames, but I don't know what such a frame would be topologically.
Also, what is the universal property of the [[locale of real numbers]], in the category of locales?
Toby Bartels said:
John Baez said in part:
(Is the opposite of an overt space a covert space? :crazy:)
We call a space covert if its terminal map (the unique map to the terminal space) is closed. This is a less important concept, and (like overtness) trivial for topological spaces; I forget if it's also classically trivial for locales.
For topological spaces, you can also define hausdorffness by requiring the diagonal to be closed, rather than proper. But for locales, even classically, this is a different notion, and again less important. (I don't know if it has a name.)
I thought the standard definition of Hausdorffness for locales is that the diagonal is closed. This definition appears in Handbook of categorical algebra by Francis Borceux. I also think it works pretty well constructively. Do you have a reference for the other definition?
Valery Isaev said:
Toby Bartels said:
For topological spaces, you can also define hausdorffness by requiring the diagonal to be closed, rather than proper. But for locales, even classically, this is a different notion, and again less important. (I don't know if it has a name.)
I thought the standard definition of Hausdorffness for locales is that the diagonal is closed. This definition appears in Handbook of categorical algebra by Francis Borceux. I also think it works pretty well constructively. Do you have a reference for the other definition?
Actually, my mistake was saying that they were different. (There are a lot of minor differences in conceptions of hausdorffness, and I thought that this was one of them, but I guess it's not.) Of course, every proper map is closed (and so every compact space is covert), but not conversely. However, at [[Hausdorff space]] it says that every closed embedding is proper and so these definitions are equivalent after all. (But it doesn't give a proof or reference for this fact.)
It's easy to see that closed inclusions are proper by direct computation. In fact, the right adjoint preserves all inhabited joins.
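A quick version of that computation (a sketch, presenting the closed sublocale of a locale $X$ determined by an open $u \in \mathcal{O}X$ by the frame ${\uparrow}u$): the frame quotient and its right adjoint are
\[ f^*(x) = u \vee x \colon \mathcal{O}X \to {\uparrow}u, \qquad f_* \colon {\uparrow}u \hookrightarrow \mathcal{O}X \ \text{(the inclusion)}, \]
since $u \vee x \le y$ iff $x \le y$ whenever $y \ge u$. An inhabited join of elements above $u$ is the same whether computed in ${\uparrow}u$ or in $\mathcal{O}X$, so $f_*$ preserves all inhabited joins; only the empty join (which is $u$ in ${\uparrow}u$ but $0$ in $\mathcal{O}X$) fails to be preserved.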
Madeleine Birchfield said:
Given a locale $X$, what is the frame of opens of the locale $X^{\mathbb{N}}$?
I'm not sure I understand the question. It's just the infinite coproduct $\bigoplus_{\mathbb{N}} \mathcal{O}(X)$ of the underlying frame of $X$ with itself. I think you want something more than this, but I don't know if there is anything more to say.
Graham Manuell said:
Madeleine Birchfield said:
Given a locale $X$, what is the frame of opens of the locale $X^{\mathbb{N}}$?
I'm not sure I understand the question. It's just the infinite coproduct $\bigoplus_{\mathbb{N}} \mathcal{O}(X)$ of the underlying frame of $X$ with itself. I think you want something more than this, but I don't know if there is anything more to say.
I think I did - for example, the frame of opens for the Baire space locale is just the infinite coproduct of the underlying frame of the discrete locale of the natural numbers with itself. How does that relate to the usual definition of the frame of opens for the Baire space locale in terms of lists of natural numbers?
They are the same. $\bigoplus_{\mathbb{N}} \mathcal{O}(X)$ is always the same as lists of $\mathcal{O}(X)$'s. Perhaps it'd help to know the infinite product $X^{\mathbb{N}}$ is the same as the exponential object with the discrete locale $\mathbb{N}$?
Graham Manuell said:
$\bigoplus_{\mathbb{N}} \mathcal{O}(X)$ is always the same as lists of $\mathcal{O}(X)$'s.
I should specify that I'm talking about finite lists of natural numbers, denoted $\mathrm{List}(\mathbb{N})$, which is the initial algebra of $Y \mapsto 1 + \mathbb{N} \times Y$ and the free monoid on $\mathbb{N}$, while the sequence space $\mathbb{N}^{\mathbb{N}}$ is the terminal coalgebra of $Y \mapsto \mathbb{N} \times Y$.
i.e. an open in Baire space is a subset $U$ of the set $\mathrm{List}(\mathbb{N})$ of lists of natural numbers such that:
For all $l \in \mathrm{List}(\mathbb{N})$, if $l \in U$, then for every $n \in \mathbb{N}$, the concatenation $l{::}n \in U$;
For all $l \in \mathrm{List}(\mathbb{N})$, if for every natural number $n$ the concatenation $l{::}n \in U$, then $l \in U$;
The set of all opens as defined above forms a frame with respect to subset inclusion and is thus a locale; it just so happens that the completely prime filters of this frame are isomorphic to the sequence set $\mathbb{N}^{\mathbb{N}}$.
How does this definition of the frame of opens of Baire space relate to the categorical definition of Baire space as the exponential object $\mathbb{N}^{\mathbb{N}}$ in the category of locales, or equivalently via the infinite coproduct $\bigoplus_{\mathbb{N}} \mathcal{O}(\mathbb{N})$ in the category of frames?
This definition also seems to be just for discrete $X$; what if $X$ is not discrete?
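One way to see where those two conditions come from (a sketch in the point-set picture; the localic frame is presented by the same data): a finite list $l$ determines a basic open of sequences extending it, and
\[ B_l = \{\, \alpha \in \mathbb{N}^{\mathbb{N}} \mid l \text{ is a prefix of } \alpha \,\}, \qquad B_l = \bigcup_{n \in \mathbb{N}} B_{l{::}n}, \]
so an open $V$ is determined by the set $U = \{\, l \mid B_l \subseteq V \,\}$, which is closed under extending lists (the first condition) and, by $B_l = \bigcup_n B_{l{::}n}$, under the reverse implication (the second condition).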
What is the difference between the locales $\mathbb{R}^{\mathbb{N}}$, where $\mathbb{R}$ is the locale of real numbers, and $\mathbb{R}_d^{\mathbb{N}}$, where $\mathbb{R}_d$ is the discrete locale of the Dedekind real numbers? (As a discrete locale, we'll also write $\mathbb{R}_d$ for the underlying set of Dedekind real numbers.) The latter is presumably constructed out of opens defined as subsets $U$ of the set of lists of Dedekind real numbers such that:
For all $l \in \mathrm{List}(\mathbb{R}_d)$, if $l \in U$, then for every $r \in \mathbb{R}_d$, the concatenation $l{::}r \in U$;
For all $l \in \mathrm{List}(\mathbb{R}_d)$, if for every Dedekind real number $r$ the concatenation $l{::}r \in U$, then $l \in U$;
But I'm not sure if this definition would work for the former: what exactly is a list in the locale of real numbers, as opposed to a list of Dedekind real numbers? Of course we have the initial algebra $\mathrm{List}(\mathbb{R})$ of the endofunctor $Y \mapsto 1 + \mathbb{R} \times Y$ in the category of locales, but does it even make sense to use the above definition to define opens in $\mathbb{R}^{\mathbb{N}}$?
You can always take basic opens for the product that are finite lists of basic opens from the factors. I think you are getting confused by the fact that for a discrete locale the singletons form a base. In general the points won't play any role.
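In symbols (a sketch, using the coproduct injections $\iota_n \colon \mathcal{O}X \to \bigoplus_{\mathbb{N}} \mathcal{O}X$): the generators are the elements $\iota_n(a)$ for $a \in \mathcal{O}X$, finite meets of generators form a base, and after padding with the top element one may as well take meets over an initial segment of coordinates:
\[ \langle a_0, \ldots, a_{k-1} \rangle \;:=\; \iota_0(a_0) \wedge \cdots \wedge \iota_{k-1}(a_{k-1}), \]
so basic opens of $X^{\mathbb{N}}$ are indexed by finite lists of (basic) opens of $X$. When $X$ is discrete the singletons form a base, and these lists become the finite lists of points appearing above.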
Toby Bartels said:
For topological spaces, you can also define hausdorffness by requiring the diagonal to be closed, rather than proper. But for locales, even classically, this is a different notion, and again less important. (I don't know if it has a name.)
In the Elephant, Johnstone uses the term "separated" in generalizing the "diagonal is proper" property to general geometric morphisms
How to mimic the $T_2$ property of a topological space for locales? :thought: :thinking: @Valery Isaev
The answer was given by John Isbell: a top. space X is said to be $T_2$ iff the diagonal $\Delta_X \subseteq X \times X$ is closed. Well, let us say that a frame L is $T_2$ - or better I-$T_2$ (= Isbell-Hausdorff, aka strongly Hausdorff frame) - if the diagonal $\Delta \colon L \to L \oplus L$ is a closed localic map.
Moreover, if $\Omega(X)$ is I-$T_2$ then X is $T_2$.
P.S. Good evening everyone! :sunglasses:
Madeleine Birchfield said:
Chris Heunen said:
But for that theorem you in fact don't need Soler! Matthew Di Meglio and I figured out how to relate categorical colimits to analytic limits of real numbers directly, which is much more satisfying. See also Matt's nice blog post on the process of how that came to be.
Well here the issue is that I'm not aware of anybody who has proven that DeMarr’s theorem, which replaced Soler's theorem, holds in constructive mathematics.
I suspect that DeMarr's theorem won't hold in constructive mathematics because there are multiple Archimedean ordered Heyting fields (which are partially ordered semifields) which are sequentially complete and thus are monotone sequentially complete, but are not provably equivalent to each other.
And there are axiom sets that one can add to constructive mathematics to show that there are two provably inequivalent sequentially complete Archimedean ordered fields: the LPO and Brouwer's continuity principle for Dedekind reals show that the Escardo-Simpson reals (the initial sequentially complete Archimedean ordered field) are inequivalent to the Dedekind reals (the terminal sequentially complete Archimedean ordered field) - the former is a discrete field under LPO while the latter is provably not a discrete field under Brouwer's continuity principle.
It's been a while, but yes, one should definitely use stronger conditions to try to get a DeMarr-like theorem constructively, of any sort, just as an exercise (even if with a weaker conclusion). I'm only catching up now, so apologies if this was discussed subsequently, and for bringing up such an old topic.
For what it's worth, and this is also an old topic: a vast chunk of the mathematics needed for physics is encodable in a strongly finitist foundation, by work of Feng Ye. I don't know how this interacts with taking non-classical foundations for stronger systems like constructive measure theory. But it seems to me in principle that, just as one can encode (Cauchy) reals as fast-converging Cauchy sequences with a given modulus of convergence, having finitist approximations to the raw ingredients of the mathematics for theoretical physics, perhaps with some kind of effectivity/uniformity built in, feels closer to constructive than otherwise.
Graham Manuell said:
I would (grudgingly?) accept the space of reals, but only when equipped with its topology. I wouldn't call a space of reals.
Did you mean ? We have , I thought. (EDIT: has been fixed)
David Michael Roberts said:
Did you mean ? We have , I thought.
Yep. Fixed
Let's return to the concept of predicativity.
Madeleine Birchfield said:
This really depends on the predicativist and their beliefs. There are many people calling themselves predicativists who would accept anything and everything so long as the things they assume don't yield power sets. They would accept say the induction principle for the natural numbers or the induction principle for binary trees, since to them impredicativity = power sets, and neither induction principle proves that power sets exist.
Apparently, dependent type theorists also use impredicativity to mean that a universe $U$ is closed under products of families of types / sets in $U$ indexed by arbitrary types. Here then, we have a three-way contrast between the different notions of predicativity / impredicativity, depending on which dependent products are $U$-small.
There is the predicativity in the sense that @Federica Pasqualone is talking about, that for a universe $U$ the type $\mathbb{N}$ of natural numbers is $U$-small but the induction principle of the natural numbers for families of $U$-small types is $U$-large, which implies that $U$ is not closed under dependent products of $U$-small types indexed by a $U$-small type. Dependent products of families of $U$-small types indexed by a finite type are still $U$-small, though.
Then there is the usual sense of predicativity in dependent type theory, where $U$ is closed under dependent products of $U$-small types indexed by a $U$-small type, but which is still impredicative in the first sense.
Then there is the impredicativity in dependent type theory, where $U$ is closed under dependent products of arbitrarily-indexed families of $U$-small types.
This is about all $U$-small types; we get another four possible versions by restricting to the $U$-small h-propositions:
There is the predicativity in the actual sense that @Federica Pasqualone is talking about, since she works in first-order logic rather than dependent type theory: for a universe $U$ the type $\mathbb{N}$ of natural numbers is $U$-small but the induction principle of the natural numbers for families of $U$-small h-propositions is $U$-large.
Then there is the sense of predicativity in dependent type theory, where $U$ is closed under dependent products of $U$-small h-propositions indexed by a $U$-small type (the condition of a Heyting category), but which is still impredicative in the first sense.
Then there is the impredicativity in dependent type theory, where $U$ is closed under dependent products of arbitrarily-indexed families of $U$-small h-propositions.
Even stronger than the previous statement is that $U$ is closed under power sets.
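In rule form, the three contrasts for types above can be summarized roughly as follows (a sketch; $U$ a universe, and the notation is mine rather than from any particular proof assistant):
\[
\begin{aligned}
&\text{(1)} && A, B : U \ \vdash\ A \times B : U, \quad \text{but } \textstyle\prod_{n : \mathbb{N}} B(n) \text{ need not lie in } U;\\
&\text{(2)} && A : U,\ B : A \to U \ \vdash\ \textstyle\prod_{a : A} B(a) : U;\\
&\text{(3)} && A : \mathrm{Type},\ B : A \to U \ \vdash\ \textstyle\prod_{a : A} B(a) : U,
\end{aligned}
\]
with the corresponding statements for h-propositions obtained by additionally requiring each $B(a)$ to be an h-proposition.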
Ah predicativity :heart_eyes: , this is a very good topic ... :zany_face:
Thanks a lot @Madeleine Birchfield for these useful remarks about DTT.
Madeleine Birchfield said:
I think the only person who has done something like this so far is Ulrik Buchholtz and his work on primitive recursive arithmetic in homotopy type theory, where the bottom universe level doesn't contain function types / dependent product types.
Found the preprint which talks about this:
https://arxiv.org/abs/2404.01011
Thanks for sharing!
Maybe we should also remind the audience of the predicative extension of Martin Hyland's effective topos given by @Maietti Maria Emilia and @Samuele Maschio , I found this a very interesting result too :heart_eyes:
and it's not because I love lex categories, completions and friends hahhaha :zany_face:
Morning btw!
John Baez said:
The path integral, and the infinities in quantum field theory, are something mathematical physicists care about a lot - so if any new approach to mathematics really helps, they will be interested. But this seems to be a hard problem.
Do the newer approaches formulated by Nima Arkani-Hamed like amplituhedrons and associahedrons suffer from the same problems with infinity as the path integral?
I'm not as knowledgeable about this as I'd like to be, so take this with a grain of salt:
These approaches are ways of calculating the path integral. I believe these approaches only handle simplified theories where the infinities are not present. Their advantage is that instead of expressing the path integral as an infinite sum of terms corresponding to "Feynman diagrams", they express it as the integral over a polytope.
For example, in 2013 Nima Arkani-Hamed and Jaroslav Trnka expressed certain path integrals for "the planar limit of N = 4 super-Yang-Mills theory" as integrals over certain polytopes called amplituhedra.
Yang-Mills theory is a very important 4-dimensional quantum field theory - when coupled to spin-1/2 particles and the Higgs, it gives the Standard Model. But in the theory Arkani-Hamed considers:
In short, the theory is made easier to deal with in 3 ways, and these 3 ways eliminate the infinities and allow for something closer to a 'closed form' solution.
So, it's quite far from real-world physics, but it's very pretty.
It's worth comparing pure Yang-Mills theory with Lie group SU(n). Here there is a 1 million dollar prize for eliminating the infinities and getting a well-behaved theory. But this is much harder than what Arkani-Hamed is doing, because by working with the planar limit of supersymmetric Yang-Mills theory one gets a theory with infinitely many conserved quantities, which is much more tractable.
From what I can tell, Arkani-Hamed's recent work in the subject is on the non-supersymmetric Tr($\phi^3$) theory, i.e.:
https://arxiv.org/abs/2401.00041
That sounds like a big step toward greater realism, since supersymmetry is like a magic potion that cancels infinities: for example the infinite contribution to the vacuum energy caused by each boson is cancelled by the contribution caused by its fermionic 'superpartner'. This is why physicists like supersymmetry; there's just unfortunately not a shred of evidence that it's relevant to our universe.
However there are still big deviations from realism. It took me quite a while digging back into the literature to figure out the fields and Lagrangian for this theory! The whole approach goes so far from the traditional ones that they no longer talk much about these basic issues. But reading the references in the references in the references I found the answer here:
This naturally motivates the study of a theory of scalars in the adjoint of the product of two different color groups. The simplest possibility is the theory with only cubic interactions of the form
In other words, the basic field in this theory is
for some integers , and the Lagrangian is the cubic expression shown above.
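For concreteness, the cubic "bi-adjoint scalar" Lagrangian that appears in this corner of the literature typically takes a form roughly like this (a sketch; conventions and normalizations vary, and this is a paraphrase rather than the paper's exact expression):
\[
\mathcal{L} \;=\; \tfrac{1}{2}\,\partial_\mu \phi^{a\tilde a}\,\partial^\mu \phi^{a\tilde a}
\;+\; \tfrac{g}{3!}\, f^{abc}\,\tilde f^{\tilde a\tilde b\tilde c}\,\phi^{a\tilde a}\phi^{b\tilde b}\phi^{c\tilde c},
\]
where $\phi^{a\tilde a}$ carries one adjoint index for each of the two color groups and $f$, $\tilde f$ are the corresponding structure constants.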
This is nothing like what we see in the real world.
I still can't tell if Arkani-Hamed and others are only considering the limit $N \to \infty$, which is another potent but physically unrealistic simplification, because in this limit the symmetry group becomes infinite-dimensional and (as I mentioned) all Feynman diagrams that aren't planar graphs become irrelevant.
I don't mind mathematical physicists studying theories that are made up because they are mathematically beautiful and easy to study! I did a lot of this myself in my impressionable youth. I just don't like it when the pop science media neglect to tell us that this work is not about realistic physics.
Anyway, thanks for asking about this, because the work looks beautiful - and when I go back to the earlier papers I get a bit of hope that I could actually understand it.
It's always a pleasure to wake up with some first class physics!
and this time Yang-Mills!!! :heart_eyes:
Thanks a lot for the comments and the references but .. maybe we should stop talking about million dollar prizes or people will drop business school to join us hahahha
The hardest way to earn a million bucks is to solve a Clay Mathematics problem - and it's so hard that the only person who ever did it refused to accept the million bucks, because nobody who really cared about money would be working on such things.
Btw, I noticed that this paper:
only computes the scattering matrix for this theory at 'tree level'. This means they came up with a formula for the sum over all Feynman diagrams that are trees with specified leaves. This is even more restrictive than the sum over planar Feynman diagrams!
Arkani-Hamed and company seem to have gone further, but I can't tell if they are looking just at a way to sum over planar diagrams, or all diagrams.
mmh I see. Well, this is the perfect reading for fall break ...
There is also this more recent paper from May of this year:
The problem with the new papers is that they prove lots of stuff about the theory but never say what that theory is. It's hard to jump on the train when it's already moving. That's why I need to look at old papers... or maybe review articles, if they exist.
16 messages were moved from this topic to #theory: category theory > Arkani-Hamed, amplitudohedron and associahedron by John Baez.
I was wondering, still dreaming about the million bucks :innocent: , do we have any result on the existence and uniqueness of the measure, even before trying to construct it? - I know it may be a provocative question.
You're talking about the measure in the path integral for Euclidean Yang-Mills theory in 4 dimensions? No: if we knew this existed, it would be just a matter of engineering to actually build it. :smirk:
I don't know of anyone who has even tried to prove uniqueness.... uniqueness given some properties, but we don't know which properties to try....
Mmh .. thank you! Me and my ontological worries ...
Note that you only get the $1,000,000 prize if you not only construct the measure for 4d Yang-Mills, but also prove it has a 'mass gap': i.e., the spectrum of the Hamiltonian lies in $\{0\} \cup [\Delta, \infty)$ for some $\Delta > 0$. This means that there's a lightest particle, called a 'glueball', with positive mass.
hahahahah one step at time! Subdivision of tasks, you know .... :rolling_on_the_floor_laughing:
This whole prize question is mathematically much less well-defined than the other Clay math prizes, since nobody even knows what it means for some measure to be the measure for 4d Yang-Mills theory! To win the prize, I guess you'd have to convince the jury that you're doing the right thing - where 'right thing' is not yet well-defined.
Well, we have plenty of time ... hahahha