Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: theory: categorical probability

Topic: Universal property of the Giry monad

Nathaniel Virgo (Feb 04 2025 at 13:52):

This is idle wondering and I don't really expect the answer to be known, but maybe it will be an interesting conversation.

Chen's A universal characterisation of the category of standard Borel spaces characterises the category of standard Borel spaces as the universal one with certain nice properties. This is great because it means not only that we know a bunch of nice properties that this category has, but also that those are in some sense its only relevant ones when it comes to proving things about it.

However, that paper doesn't really mention Markov kernels or monads at all, so it doesn't immediately let us talk about probability.

So I'm wondering to what it extent it might be possible to construct the Giry monad, or $\bf BorelStoch$ , in a similarly abstract way.

In other words, if someone hands me a category and tells me that it's the initial countably complete countably extensive category, how can I use that information to construct the notions of measure and Markov kernel?

Benedikt Peterseim (Feb 04 2025 at 14:10):

This is a very interesting question! I hadn’t seen that paper before.

Benedikt Peterseim (Feb 04 2025 at 14:15):

Of course, there are constructions of probability monads as Codensity monads, but they presuppose that you have the unit interval in your category already. On the other hand, there is also work on universal characterisations of the unit interval, so maybe using one of these, one might actually obtain a universal property for BorelStoch. This seems far from straightforward, though.

Tobias Fritz (Feb 04 2025 at 15:12):

The following is pretty speculative, but I think a characterization like the following could be true and conceptually meaningful:

Conjecture: $\mathsf{BorelStoch}$ is the free Markov category with countable Kolmogorov products generated by the Markov category $\mathsf{FinStoch}$ .

For the morphisms, I imagine using strong symmetric monoidal functors presering copy. This would be in a bit of a different direction than Chen's result though.

Tobias Fritz (Feb 04 2025 at 15:16):

More along the lines of Chen's paper, perhaps one could consider something like the free representable Markov category that is countably distributive?

Tobias Fritz (Feb 04 2025 at 16:27):

Perhaps worth mentioning is also the idea of free affine commutative monads, which my student @Victor Bloch is working on. Similar to how the free monad generated by a binary operation is the monad of binary trees, Victor has constructed the free affine commutative monad generated by a binary operation on Set, and we seem to be able to show that it embeds into the distribution monad in a way that is topologically dense. (Very roughly, think of approximating a distribution by repeated coin flips.)

Moreover, we conjecture that the Radon monad, which is the standard probability monad on the category of compact Hausdorff spaces, literally is the free affine commutative monad generated by a binary operation. But so far we don't know how to approach this problem.

Binary operations arguably aren't special in this story, and I'd expect everything to work the same if one starts with an operation of any finite arity.

Benedikt Peterseim (Feb 04 2025 at 16:38):

Tobias Fritz said:

Perhaps worth mentioning is also the idea of free affine commutative monads, which my student Victor Bloch is working on. Similar to how the free monad generated by a binary operation is the monad of binary trees, Victor has constructed the free affine commutative monad generated by a binary operation on Set, and we seem to be able to show that it embeds into the distribution monad in a way that is topologically dense. (Very roughly, think of approximating a distribution by repeated coin flipping.)

This is interesting—so you get a construction of the dyadic rational unit interval as M(2) entirely for free? (where M is this free commutative affine monad) Also, if the conjecture about the Radon monad is true, I suppose that would give another (very natural, in my opinion) universal chracterisation of the unit interval?

Tobias Fritz (Feb 04 2025 at 16:41):

It's actually not the dyadic interval -- it's all those numbers that you can get by repeatedly flipping a biased coin with transcendental bias!

Benedikt Peterseim (Feb 04 2025 at 16:41):

By the way, are you using midpoint algebras (https://ncatlab.org/nlab/show/midpoint+algebra) or variations of this in any way? I think there are characterizations of the unit interval that use this iirc, so I’m thinking it might be relevant to the conjecture regarding the Radon monad.

Benedikt Peterseim (Feb 04 2025 at 16:41):

Tobias Fritz said:

It's actually not the dyadic interval -- it's all those numbers that you can get by repeatedly flipping a biased coin with transcendental bias!

Oh, I’m surprised and confused. :)

Tobias Fritz (Feb 04 2025 at 16:41):

The reason is that the defining operation of a midpoint algebra is commutative, so it cannot be the generating operation in a free monad, and not even in a free affine commutative monad :sweat_smile:

Tobias Fritz (Feb 04 2025 at 16:42):

This is clear once you note that not all binary operations in a commutative monad are commutative.

Benedikt Peterseim (Feb 04 2025 at 16:43):

Ah, I think I’m starting to see this, okay. :)

Tobias Fritz (Feb 04 2025 at 16:45):

The difficult logical part of this is from On convex linear forms, a 1974 paper by Fajtlowicz and Mycielski, who show that the theory of binary convex combinations with a transcendental coefficient is axiomatized by two equations:
image.png
Intuitively, these two equations say precisely that the monad is affine resp. commutative.

Tobias Fritz (Feb 04 2025 at 16:46):

A good way to understand the second equation is to note that it amounts to the operation distributing over itself, or what we would call commuting with itself. Like how in a commutative monad any two operations commute.

Tobias Fritz (Feb 04 2025 at 16:50):

So these algebraic structures are midpoint algebras without the commutativity. They seem to be called modes.

Tobias Fritz (Feb 04 2025 at 16:52):

Maybe you'd have some idea for how to approach the case of compact Hausdorff spaces @Benedikt Peterseim ? Happy to talk more in case you're interested.

Benedikt Peterseim (Feb 04 2025 at 18:05):

Not any specific idea right now, but it does seem of a similar flavour to some things I’ve thought about before. Would definitely be interested in some more details and in sharing some thoughts; feel free to DM me.

Morgan Rogers (he/him) (Feb 05 2025 at 09:14):

Benedikt Peterseim said:

Of course, there are constructions of probability monads as Codensity monads, but they presuppose that you have the unit interval in your category already. On the other hand, there is also work on universal characterisations of the unit interval, so maybe using one of these, one might actually obtain a universal property for BorelStoch.

You don't need a universal characterization of the unit interval, it's enough to have a construction of the unit interval using the limits and colimits available in the characterization of BorelStoch. The subtlety is just that while there is a unique uncountable measure space, there are non-equivalent ways of equipping it with this structure. We already have the trivial (two element) interval. Constructing the diadic interval is easy enough. Fortunately the operations involved in the standard unit interval are continuous (except division at 0, which is still okay) and hence measurable, so this should be possible.

Morgan Rogers (he/him) (Feb 05 2025 at 09:15):

A consequence would be that you can do probability theory over any countably complete Boolean countably lextensive category with the same suite of interval objects.

Nathaniel Virgo (Feb 08 2025 at 19:55):

Morgan Rogers (he/him) said:

A consequence would be that you can do probability theory over any countably complete Boolean countably lextensive category with the same suite of interval objects.

I'm kind of confused about this point actually. Assuming we can construct the Giry monad from the limits and colimits in the characterisation of StdBorel, however we do it, shouldn't that mean we can do it in any countably complete Boolean countably extensive category? Since Set is such a category (right?), this seems to imply we'd get for free a monad on Set that inherits all the nice properties of the Giry monad. That would be great if it's true, but it sounds like too much to hope for. Presumably something would prevent that - do you have an idea what it would be?

Nathaniel Virgo (Feb 08 2025 at 22:00):

With that caveat in mind, here's another conjecture about how to answer this version of my question:

Nathaniel Virgo said:

In other words, if someone hands me a category and tells me that it's the initial countably complete Boolean countably extensive category, how can I use that information to construct the notions of measure and Markov kernel?

Take an object $X$ and construct $X^\mathbb{N}$ as a countable product. Consider the group of finite permutations of $\mathbb{N}$ and its action on $X^\mathbb{N}$ . I think we have enough structure to construct the object of orbits of this group action. I conjecture that this is $P(X)$ , the Giry monad applied to $X$ .

This is because of de Finetti's theorem. Whatever the quotient of this group action is, measures over it ought to be basically the same as exchangeable measures over $X^\mathbb{N}$ , which are the same as measures over $P(X)$ . So in particular, elements of it should just be elements of $P(X)$ . (Since over standard Borel spaces an element is the same as a measure concentrated on that element.)

If that's right then one can probably get the monad structure by considering a bijection between sequences of sequences of elements of $X$ and sequences of elements of $X$ .

(Although maybe there is some weird measure-theoretic reason why this doesn't work. Usually when I make a guess about measure theory it turns out to be wrong.)

Kevin Carlson (Feb 09 2025 at 07:32):

Doesn't the Giry monad restrict to a monad on Set along the inclusion of sets as discrete measurable spaces?

John Baez (Feb 09 2025 at 08:02):

I don't think so: the Giry monad maps a measurable space to the measurable space of probability measures on it, but the measurable space of probability measures on a discrete measurable space is not itself discrete, e.g. the probability measure space on 2 is isomorphic to [0,1] with its usual Borel sigma-algebra, which is different than the sigma-algebra of the discrete measurable space [0,1] (namely the sigma-algebra of all subsets of [0,1]).

Kevin Carlson (Feb 09 2025 at 08:52):

Ah, right.

Morgan Rogers (he/him) (Feb 09 2025 at 17:21):

Nathaniel Virgo said:

Morgan Rogers (he/him) said:

A consequence would be that you can do probability theory over any countably complete Boolean countably lextensive category with the same suite of interval objects.

I'm kind of confused about this point actually. ...shouldn't that mean we can do it in any countably complete Boolean countably extensive category? Since Set is such a category (right?), this seems to imply we'd get for free a monad on Set that inherits all the nice properties of the Giry monad.

I expect that the confusion comes from assuming that the monad thus constructed will inherit the properties of the Giry monad? We could figure this out explicitly, though: what is an example of a nice property that the Giry monad has that you expect to fail when the construction is transported to Set?

Nathaniel Virgo (Feb 10 2025 at 11:54):

If I try to be more precise I suppose that one of the following things must happen: either the monad needs initiality of StdBorel to construct it (hence you can't construct the same monad on Set), or some of its nice properties need initiality of StdBorel to prove them (hence its counterpart on Set doesn't have those properties), or the Set version does inherit all the nice properties but becomes trivial in some way, or at least isn't useful for probability in the same way that BorelStoch is. Or none of the above, which is the best possible outcome but seems unlikely. Or something else and I'm more confused than I thought, which is quite possible.

Specifically, the properties I'm thinking of are the following, which I regularly use to reason about its Kleisli category, BorelStoch:

its Kleisli category is a representable Markov category, which I think means the monad has this property, per Fritz+Gonda+Perrone+Rischel's paper on representable Markov categories
image.png
its Kleisli category has conditionals, in a way that's compatible with representability
its Kleisli category has countable [[Kolmogorov products]], which I think means the monad preserves a certain cofiltered limit

These are basically the assumptions used in Fritz+Gonda+Perrone's abstract proof of de Finetti's theorem.

Tobias Fritz (Feb 11 2025 at 15:31):

Nathaniel Virgo said:

Take an object $X$ and construct $X^\mathbb{N}$ as a countable product. Consider the group of finite permutations of $\mathbb{N}$ and its action on $X^\mathbb{N}$ . I think we have enough structure to construct the object of orbits of this group action. I conjecture that this is $P(X)$ , the Giry monad applied to $X$ .

This is similar in spirit to what we're currently working on with the law of large numbers. This is at a sufficiently advanced stage that we could possibly share a draft if you're interested.

What I imagine that you have in mind is to consider a map $X^{\mathbb{N}} \to PX$ which takes every sequence to the associated "empirical measure" defined by the relative frequencies of the elements in the sequence. The first problem now is that this doesn't make sense in general, because many sequences don't have well-defined relative frequencies. Therefore the transformation can at most be partially defined. Another issue is that you can have two sequences for which the relative frequencies do converge, and even be identical so that you get the same element of $PX$ , even if the sequences do not differ by a finite permutation. (They don't even necessarily differ by an infinite permutation, because one may contain elements with relative frequency zero!) So although a partial transformation exists, there's no simple characterization of when two elements of its domain get mapped to the same element of the codomain.

(There are further analytical subtleties that we've had to struggle with, which is part of why it's taken us a few years to get to the point where we have a draft.)

Tobias Fritz (Feb 11 2025 at 15:31):

With that being said, I think that your construction could nevertheless give an affine commutative monad on Meas or BorelMeas, so that its Kleisli category is going to be a Markov category. Maybe this Markov category is of some interest even if it's not Stoch? For one thing, this construction clearly works on Set too.

Tobias Fritz (Feb 11 2025 at 15:36):

Nathaniel Virgo said:

This is because of de Finetti's theorem. Whatever the quotient of this group action is, measures over it ought to be basically the same as exchangeable measures over $X^\mathbb{N}$ , which are the same as measures over $P(X)$ . So in particular, elements of it should just be elements of $P(X)$ . (Since over standard Borel spaces an element is the same as a measure concentrated on that element.)

This indeed gives a construction of PX, but it's different in two ways from the quotient that you've described:

It's the limit rather than colimit of $X^{\mathbb{N}}$ with respect to the action of finite permutations. Basically this is because the de Finetti theorem in the form that you've quoted is about mapping into $X^{\mathbb{N}}$ and $PX$ rather than out. (If you're willing to quotient everything by almost sure equality with respect to some reference measure, then you get a dagger category and can also view this as a colimit. @Paolo Perrone in particular could say more about this.)
This limit is not a limit in BorelMeas, but actually in BorelStoch! Because that's where the measures are morphisms.

Tobias Fritz (Feb 11 2025 at 15:40):

So although we can indeed characterize the Giry monad like this, doing so requires starting with the Markov category already. I still think that this is interesting, and this is what we're doing with the law of large numbers: we assume that we have a Markov category with additional structure that makes it have a law of large numbers, and then we derive representability from that.

Tobias Fritz (Feb 11 2025 at 15:41):

Conceptually, I think it's good practice to think of Markov categories as the primary structures and the monads as derived rather than the other way around.

Nathaniel Virgo (Feb 11 2025 at 18:31):

Ah, of course, that all makes complete sense. I know all these facts very well, I had just not realised I was ignoring them!

Still, I wonder if there's a way to fix it. It would be kind of neat to turn de Finetti's theorem into the definition of a probability measure, if it can be done.

I agree with the general principle of seeing Markov categories rather than monads as primary. It's just that it seems the work of characterising the deterministic subcategory of BorelStoch has already been done, so I was speculating about ways to get from there to an abstract characterisation of BorelStoch itself.

Morgan Rogers (he/him) (Feb 12 2025 at 14:15):

Reviewing the [[Giry monad]] page, I see that the definition is more complicated than I had naively misremembered it was; it's not just a matter of homming into an interval, for instance. The definition of probability measure on an object of a countably (co)complete Boolean lextensive category is a monotone function from its BA of subobjects to $[0,1]$ satisfying countable constraints (these can either be expressed in terms of countable increasing chains or countable disjoint elements) and modularity (see [[probability valuation]] for that condition, which amounts to inclusion-exclusion). On the face of it, there is no reason for the collection of such maps to have the structure of an object of the same Boolean category, nor does the interval itself need to be a part of the Boolean category for this definition to make sense.

On the other hand, suppose that $[0,1]$ , rather than being a mere object of a Boolean category, were instead realised as the lattice of subobjects of some structure $I$ in such category. Suppose further that this type of structure were such that every object comes with a trivial such structure (whose substructures are in correspondence with subobjects) and that the category of such structures is enriched over the Boolean category. Then the hom-object $I \to X$ could recover the collection of monotone bla-bla-bla functions from subobjects of X to $[0,1]$ by inverse images. That's the best version of what I was imagining that I can come up with, but it has too many holes to actually be usable: it's not even clear to me if the kind of structure that's needed is something resembling an order structure, something quantale-like or the action of a symmetric group, say (permutations of a countable set, say, if this is somehow directly related to di Finetti?)
If anyone wants to run with that idea or shoot it down, be my guest. :wink:

Sam Staton (Feb 12 2025 at 16:47):

Hi I just noticed this thread (late again!) @Tobias Fritz and @Victor Bloch, you were talking about the free affine commutative monad generated by a binary operation. Are you sure it embeds in the distribution monad? If I understand what you wrote correctly, it should be something like the Bernstein polynomials. We investigated this in Beta-Bernoulli processes and algebraic effects, and the characterization result is Proposition 7. If I understand what you said, you are looking at the case $l=1$ , and we notated the binary operation $?_p$ . But perhaps you had something different in mind, which would also be very interesting!

Sam Staton (Feb 12 2025 at 16:54):

(Sorry to jump back! I think Morgan's question is interesting, the reason why Borel probability kernels form a Kleisli category is a bit of a mystery, and didn't mean to deflect!)

Tobias Fritz (Feb 12 2025 at 18:43):

Sam Staton said:

Hi I just noticed this thread (late again!) Tobias Fritz and Victor Bloch, you were talking about the free affine commutative monad generated by a binary operation. Are you sure it embeds in the distribution monad? If I understand what you wrote correctly, it should be something like the Bernstein polynomials. We investigated this in Beta-Bernoulli processes and algebraic effects, and the characterization result is Proposition 7. If I understand what you said, you are looking at the case $l=1$ , and we notated the binary operation $?_p$ .

Thanks for the reminder about that paper, @Sam Staton ! It's a challenging read for someone with a background like mine, and I had tried and failed before, but right now I think I'm understanding a little more.

Would it be correct to say that if one phrases your formalism as an algebraic theory, then your $\nu_{i,j}$ are like having access to the rational numbers $\frac{i}{i+j}$ as constants, so that your algebraic theory includes the theory of rational convexity in particular?

Tobias Fritz (Feb 12 2025 at 18:49):

In Section 5, you say that your monad lives on the functor category [FinSet, Set], and indicate that this is because it's a parametrized algebraic theory. Is it possible to sketch the intuition behind this without giving the full definition of what a parametrized algebraic theory is?

Tobias Fritz (Feb 12 2025 at 18:53):

In any case, I do see some similarity with what we're doing, even if it's perhaps not exactly the same. I agree that the Bernstein polynomials are related in some way, but we haven't actually needed them explicitly: they are under the hood in the results in On convex linear forms, which we're simply using as a black box.

Tobias Fritz (Feb 12 2025 at 18:54):

Maybe it helps to look at their result:
image.png
Is this an instance of your Proposition 7?

Tobias Fritz (Feb 12 2025 at 19:05):

To my mind, the main takeaway of these results is more philosophical rather than technical, namely the idea that probability monads are very close to being free in a certain sense. For example, it seems plausible that the Radon monad on CHaus is the free affine commutative monad generated by a binary operation, or even by an operation of any arity $\ge 2$ (though we haven't been able to prove it yet). Is this kind of idea something that you've known already or seen somewhere else?

Sam Staton (Feb 13 2025 at 07:08):

Thanks @Tobias Fritz ! I don't think zulip has threads so I don't know how to discuss multiple things at once but there's lots of interesting stuff in your message!

For our result, maybe the paper you posted could be a corollary (thanks for the reference, I didn't know it but it is nice) but I think it is quite different. Our paper is quite condensed, I'd quite like to now write a Markov category version. Here is the statement $l=1$ extracted and without the other probabilities or $\nu$ around. I hope I haven't made a mistake.

Sam Staton (Feb 13 2025 at 07:08):

As you know we can make a finite distributions monad $T_R$ for any commutative semiring $R$ . It's the submonad of the modules monad where we only take things that sum to $1$ .

Now consider the commutative semiring $R=\langle x,y\ |\ x+y=1,xy=yx\rangle$ . We should think of $y=(1-x)$ . This could be called the Bernstein semiring. It should be a subsemiring of the polynomial ring $\mathbb{Z}[x]$ .

Theorem (if I'm remembering correctly): The free commutative affine monad with one binary operation is $T_R$ .

Sam Staton (Feb 13 2025 at 07:09):

I can see that someone (maybe Sean Moss) has typed up a tidy proof of this very fact in our repository, we could perhaps share it. But first I wanted to check whether this is the result you are looking at with @Victor Bloch.

Tobias Fritz (Feb 13 2025 at 07:25):

Oh I see, that's a really nice and intriguing way to put it!

Tobias Fritz (Feb 13 2025 at 07:27):

No, we haven't known any explicit description like that. Perhaps it could be helpful ~~(in combination with Stone-Weierstrass)~~ for tackling the case of CHaus and the Radon monad.

Tobias Fritz (Feb 13 2025 at 07:30):

Maybe it would make sense to join forces on this and write it all up in one document? What we have so far is a draft of Victor's thesis with the formal construction of the free affine commutative monad as a quotient of the free monad, the fact that it's the monad of modes, and the embedding into the distribution monad.

Tobias Fritz (Feb 13 2025 at 07:39):

(I had asked Bart last year if he already knew of anything in this direction, which was not the case, but I now realize that I should have asked you too!)

Morgan Rogers (he/him) (Feb 13 2025 at 16:50):

Sam Staton said:

Thanks Tobias Fritz ! I don't think zulip has threads so I don't know how to discuss multiple things at once but there's lots of interesting stuff in your message!

Splitting off new topics is encouraged. Linking is easy enough to connect different topics when needed :innocent:

Victor Bloch (Feb 15 2025 at 06:11):

Sam Staton schrieb:

As you know we can make a finite distributions monad $T_R$ for any commutative semiring $R$ . It's the submonad of the modules monad where we only take things that sum to $1$ .

Now consider the commutative semiring $R=\langle x,y\ |\ x+y=1,xy=yx\rangle$ . We should think of $y=(1-x)$ . This could be called the Bernstein semiring. It should be a subsemiring of the polynomial ring $\mathbb{Z}[x]$ .

Theorem (if I'm remembering correctly): The free commutative affine monad with one binary operation is $T_R$ .

This construction of the free affine commutative monad looks a lot like our construction as the free mode monad. The constructions seem to be isomorphic: We can think of the elements of $T_RX$ as formal convex combinations in $X$ with parameters in $R$ . This viewpoint then gives (it at least looks like it does) the free mode on $X$ .

Sam Staton (Feb 15 2025 at 11:17):

Thanks very much! I see the connection with the other paper now, the variable $x$ can be thought of as a trancendental.

Sam Staton (Feb 15 2025 at 11:17):

I just remembered that the version of the Theorem above is Example 5 in our LAFI abstract semirings, monads, and tensors, which we ought to get back to. Happy to collaborate! The idea there was to combine monads by combining semirings, partly as a way of building Markov categories, but also with other applications.

Sam Staton (Feb 15 2025 at 11:19):

I was interested in the polynomials-as-functions view because I wanted to think about this operation as tossing an unknown coin: what can we know about it? And then these equations are all that we can say. I was a bit inspired by the "Bernoulli factory" work.

I think perhaps this is philosophically different from the transcendental interpretation, do you agree? Even though the theory is the same. This might come out when we add limits and recursion. I was expecting to get continuous functions, or something like it, coming from the idea of the Bernstein polynomial basis or the Bernoulli factory results. Whereas I think you were hoping to just get all real probabilities? One route we were thinking of was a possible connection between *-semirings and iteration theories, which we were looking at but which I think is still hanging.

Sam Staton (Feb 15 2025 at 11:20):

(The Beta-Bernoulli paper is also in this unknown-coin direction, basically we have a collection of these unknown coins, ie a tensor of the free affine monad with one binary generator with itself $l$ times, and this indexing $l$ is why it is presheaf-enriched; plus $\nu$ has a complete interpretation as integrating-out one of the polynomials, even though the axiomatization is finite and simple. Happy to discuss more!)

Tobias Fritz (Feb 16 2025 at 17:18):

Sam Staton said:

I just remembered that the version of the Theorem above is Example 5 in our LAFI abstract semirings, monads, and tensors, which we ought to get back to.

Cool! So the proof of that isomorphism isn't yet available anywhere?

Tobias Fritz (Feb 16 2025 at 17:20):

Sam Staton said:

I was expecting to get continuous functions, or something like it, coming from the idea of the Bernstein polynomial basis or the Bernoulli factory results. Whereas I think you were hoping to just get all real probabilities?

I see. Yes, we've been hoping to get all real probabilities. Concerning my conjecture that the free affine commutative monad generated by a binary operation on CHaus would be the Radon monad, @Benedikt Peterseim has had some nice insights that go in the direction of a disproof.

Tobias Fritz (Feb 16 2025 at 17:21):

But even if that was true, it would be less convincing than obtaining such a universal property for the Giry monad on standard Borel spaces directly. Here's one way in which this might work:

Tobias Fritz (Feb 16 2025 at 17:22):

Conjecture: The Giry monad on BorelMeas is freely generated by a binary operation, among all affine commutative monads which moreover preserve countable cofiltered limits.

Tobias Fritz (Feb 16 2025 at 17:27):

The idea here is that the Giry monad first of all does preserve countable cofiltered limits; that's one version of the Kolmogorov extension theorem. For the intuition behind the conjectured universality, think of building the free affine commutative monad inductively by repeatedly applying the generating binary operation and imposing the relevant equations. Then the preservation of cofiltered limits gives us the additional power to create generalized elements of $\{0,1\}^{\mathbb{N}}$ by constructing a compatible family of marginals. In terms of a coin with transcendental bias, this amounts to allowing protocols with infinitely many coin flips! The point now is that if we allow infinite protocols, then we should be able to use that coin to simulate any other coin with arbitrary bias, and in particular get all real probabilities. Formally, this is by combining the preservation of cofiltered limits with functoriality with respect to measurable maps $\{0,1\}^{\mathbb{N}} \to \{0,1\}$ , which will create new generalized elements of $\{0,1\}$ . This will hopefully make the resulting set over $\{0,1\}$ become exactly $[0,1]$ .

Tobias Fritz (Feb 16 2025 at 17:29):

Sam Staton said:

I think perhaps this is philosophically different from the transcendental interpretation, do you agree?

I admittedly don't understand the Bernoulli factory stuff :see_no_evil: I do think of the free affine commutative monad, or rather of its Kleisli category, as a Markov category formalizing the processes that one can build by having access to a coin of unknown bias.

Tobias Fritz (Feb 16 2025 at 17:37):

Perhaps we should discuss how to proceed with regards to writing, authorship etc in private? Victor and I are preparing an ACT submission on this (with deadline on Feb 24) and should be able to share it by tomorrow. It could be nice if you wanted to contribute the Bernstein semiring observations to that. On the other hand, if I understand correctly, you're working on including this in an extended version of the LAFI paper? Of course that would be fine for us too.

It should also be okay if we don't have any definite plan by Feb 24, since our ACT submission is only for a talk and we'll try to publish elsewhere. Perhaps JPAA could be a good venue for the sort of thing that we have in mind, but a CS journal could be a good option as well especially if there is more content in that direction. (I generally try to publish in journals rather than conferences due to the higher quality of peer review.)

Nathaniel Virgo (Feb 16 2025 at 19:03):

Tobias Fritz said:

The idea here is that the Giry monad first of all does preserve countable cofiltered limits; that's one version of the Kolmogorov extension theorem.

Ooh, is there a reference/proof for this fact? I've been wondering about it recently.

Sam Staton (Feb 17 2025 at 17:09):

One reference is Dirichlet is natural, where it is called Bochner extension, there may be earlier ones too. There it is restricted to surjective diagrams, which may be important?

Sam Staton (Feb 17 2025 at 17:25):

Tobias Fritz said:

Conjecture: The Giry monad on BorelMeas is freely generated by a binary operation, among all affine commutative monads which moreover preserve countable cofiltered limits.

Thanks, this is an interesting conjecture. I'll think about it!

We had previously thought of freely adding some cofiltered limits to $\mathbf{FinStoch}$ , and then I think we arrived at Stone spaces and Radon probability kernels. Maybe that is analogous to what @Nathaniel Virgo was asking in the first place.

But this conjecture about $\mathbf{BorelMeas}$ is an interesting idea, to start from $\mathbf{BorelMeas}$ rather than freely adding the limits themselves.

I know there is @Tobias Fritz's paper with @Paolo Perrone about probability as a colimit, which seems similar in spirit to me? Also in domain theory something like this happens, the probabilisitic powerdomain includes all measures but can be generated by one binary operation, although it is a bit complicated.

Nathaniel Virgo (Feb 17 2025 at 19:06):

I might as well throw another conjecture/question into this thread, as someone here will probably know the answer: does BorelStoch have all countable cofiltered limits? I think this is a bit stronger than claiming the Giry monad preserves countable cofiltered limits, in that so far I haven't been able to prove abstractly from the things I know about BorelStoch (e.g. that it's representable, that $\textbf{BorelStoch}_{det}$ has countable limits, BorelStoch has Kolmogorov products etc.) and it does't seem like assuming the Giry monad preserves all cofiltered limits immediately helps. (It gives you some of them but not all of them, at least not straightforwardly.) But at the same time it doesn't seem much stronger so I'm still wondering if it might be true.

Paolo Perrone (Feb 17 2025 at 22:03):

This proposition might help, if I understand correctly.

Eigil Rischel (Dec 18 2025 at 17:16):

Hey everyone—I haven't really been part of the discussion here, but I've recently taken a good chunk out of this problem—the preprint is here: https://arxiv.org/abs/2512.15485

My result is that $\mathsf{Borel} \to \mathsf{BorelStoch}$ (writing $\mathsf{Borel}$ for standard Borel spaces and measurable maps) is the initial way of adding an unbiased coinflip (suitably axiomatized), among Markov functors that preserves countable (determinitsic) coproducts, countable Kolmogorov products and pullbacks along deterministic coproduct inclusions (that is, strong Markov functors so that the restriction to the deterministic subcategories is countably extensive and countably continuous).

(An unbiased coinflip in this sense is unique if it exists, so this is really a completion, not a free completion).

This is a different version of Tobias' conjecture above: the Giry monad is generated by a binary operation + the equations of a fair coinflip (concretely: $f(a,b) = f(b,a), f(a,a) = a, f(f(a,b),f(c,d)) = f(f(a,c),f(b,d))$ ), among monads on standard Borel spaces which preserve countable Kolmogorov products and pullbacks along monomorphisms.

Eigil Rischel (Dec 18 2025 at 17:36):

Yes, thank you. I've fixed the link now

Nathanael Arkor (Dec 18 2025 at 18:42):

Is there a way to see a coinflip structure on $\mathcal C$ in terms of enrichment in the category of midpoint algebras in $\mathcal C$ ?

David Corfield (Dec 19 2025 at 12:42):

@Eigil Rischel, you reference

[4] M.H. Escardo and A.K. Simpson. A universal characterization of the closed euclidean interval. In Proceedings 16th Annual IEEE Symposium on Logic in Computer Science, page 115–125, June 2001.

Note that the authors have now written a "full version" of the "extended abstract" that is [4] in:

Martin Escardo, Alex Simpson, Euclidean interval objects in categories with finite products

Paolo Perrone (Dec 19 2025 at 13:58):

@Eigil Rischel This is very nice work.
What I like the most is that one of the most precious properties of BorelStoch, namely having conditionals, is obtained from the other axioms without needing to be baked in!

Do you have a feeling of 'how this happens'? As in, which ones do you think are the properties that let us have conditionals, and how do they arise in this way?

Eigil Rischel (Dec 19 2025 at 14:48):

It's hard to extract a formal statement of this type because conditionals satisfy their defining equation only up to almost sure equality. But the obstruction to having conditionals is usually something "topological", like epimorphisms that don't split. So in some sense the point is that there are "enough" measurable maps (I guess the Boolean axiom is what guarantees this).

Benedikt Peterseim (Dec 24 2025 at 10:07):

@Eigil Rischel Very nice results, thanks for sharing! I have some questions about the initiality result for Finstoch, maybe you have thought about these things.

1) What happens when you drop the "externally iterable" condition in this characterisation? Do we get the dyadic rational version of FinStoch?
2) Can we replace the "externally iterable" condition by assuming a suitable action of scalars in [0,1] on the internal midpoint algebra structure of each object? Or (equivalently, I assume), assuming a "biased coin flip" on a Markov category?

The reason I am interested in this: there is complete equational theory of Finstoch (see https://arxiv.org/pdf/2408.14701). However, some of the equations are quite unwieldy and not at all obvious. Translating the characterisation from your paper to equations might give simpler axioms, and should be possible except for the iterability condition, I think.

Ralph Sarkis (Dec 24 2025 at 12:03):

Piedeleu et al. give an axiomatisation of BinStoch, the category of stochastic matrices whose dimensions are powers of 2. There is another axiomatisation of BinStoch by Di Giorgio et al. (Theorem 11 here) with cuter axioms imo.

In both approaches, the midpoint algebra on homsets (what Eigil calls coinflip) is derived from 1) a uniformly random state and 2) a multiplexer (or if then else, or probabilistic choice, or convex combination) operator. The multiplexer is further derived from ANDs and NOTs in Piedeleu et al.

Di Giorgio et al. also talk about iteration.

Neither paper studies the distributive structure that would yield an axiomatisation of the whole category FinStoch.

Benedikt Peterseim (Dec 24 2025 at 12:06):

Oh right, sorry about the BinStoch/FinStoch confusion here, and thanks for the pointer to this paper!

Benedikt Peterseim (Dec 24 2025 at 12:12):

But I guess my questions still make sense. And so does there potential application to an equational axiomatisation of FinStoch? The latter is left as future work in the Piedeleu et al. paper if I am not mistaken, but they haven’t followed up on this so far, as far as I am aware?

Ralph Sarkis (Dec 24 2025 at 12:52):

Not yet. There is preliminary stuff combining the probabilistic circuits with tape diagrams to handle the distributive structure here (Example 30).

Eigil Rischel (Dec 24 2025 at 13:46):

Benedikt Peterseim said:

Eigil Rischel Very nice results, thanks for sharing! I have some questions about the initiality result for Finstoch, maybe you have thought about these things.

1) What happens when you drop the "externally iterable" condition in this characterisation? Do we get the dyadic rational version of FinStoch?
2) Can we replace the "externally iterable" condition by assuming a suitable action of scalars in [0,1] on the internal midpoint algebra structure of each object? Or (equivalently, I assume), assuming a "biased coin flip" on a Markov category?

I'm fairly confident the answer to 1 is yes. As for 2, I think this is essentially the presentation of stochastic matrices given by Tobias in https://arxiv.org/abs/0902.2554

Sam Staton (Dec 24 2025 at 13:56):

Hi, just to add a perspective regarding Benedikt's question: FinStoch and the dyadic version are dual to the Lawvere theories for convex spaces and midpoint algebras respectively. So the answer to (1) is indeed yes, if I understood the question right, and I think it's quite well known. I just added some things at nlab FinStoch.

Ralph Sarkis (Dec 24 2025 at 14:13):

Tobias's paper talks about FinStoch with the coproduct (direct sum of matrices), not the usual monoidal product that people consider in categorical probability (Kronecker product of matrices).

I think this also applies to what Sam said right ? The Lawvere theory has finite products which correspond to coproducts in the dyadic version of FinStoch, not to the Kronecker product.

The presentations of BinStoch above are with the usual monoidal product, and the work on tape diagrams has both the Kronecker product and the coproduct.

Sam Staton (Dec 24 2025 at 14:32):

Hi Ralph. For the dyadic one, in the Lawvere theory of midpoint algebras as presented at nlab, the third axiom says that it's commutative (equivalently, the dual is a distributive symmetric monoidal category, like tape diagrams) and the first axiom says that the monoidal unit is initial (ie semicartesian in the dual). That just leaves the second axiom, which can be written as a plain categorical axiom, $s\circ c=c$ where $s$ is the symmetry on $1+1$ . So we can say that it's a free distributive semicartesian category generated by $c:1\to 1+1$ and satisfying this second axiom. Does that make sense?

Ralph Sarkis (Dec 24 2025 at 15:24):

I see, thanks.

Benedikt Peterseim (Dec 24 2025 at 15:38):

Sam Staton said:

Hi, just to add a perspective regarding Benedikt's question: FinStoch and the dyadic version are dual to the Lawvere theories for convex spaces and midpoint algebras respectively. So the answer to (1) is indeed yes, if I understood the question right, and I think it's quite well known. I just added some things at nlab FinStoch.

Thanks! This does indeed answer my question. Is there a published reference for this (e.g. the dyadic case you sketched), or is this “folklore”?