Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: theory: categorical probability

Topic: expectation


Spencer Breiner (May 05 2022 at 21:07):

Hi all. If I have a state in a Markov category, what do I need to assume to be able to compute an expectation, and what does that look like diagrammatically?

Spencer Breiner (May 05 2022 at 21:14):

I think I would like to appeal to the law of large numbers and to say something like the limiting average of i.i.d. copies of the state, which should converge to a Dirac distribution under some nice conditions. But maybe this characterization should be a theorem instead?
$$E(p) = \lim_{n\to\infty} \frac{1}{n}(n\cdot p)$$
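(Here $n \cdot p$ presumably means the $n$-fold convolution power of $p$, i.e. the law of the sum of $n$ i.i.d. draws. In classical terms, the statement being appealed to is the strong law of large numbers: if $p$ has a finite first moment and $X_1, X_2, \dots \sim p$ are i.i.d., then
$$\frac{1}{n}(X_1 + \dots + X_n) \longrightarrow \int x \, p(dx) \quad \text{almost surely},$$
so the rescaled convolution powers converge weakly to the Dirac distribution $\delta_{E(p)}$.)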

Spencer Breiner (May 05 2022 at 21:18):

I suppose that in many cases an expectation is an algebra for the probability monad, but I don't know what that looks like in general Markov categories.
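Concretely, in the simplest case (standard notation, not from the thread): for the finitely supported distribution monad $D$ on $\mathbf{Set}$, with multiplication $\mu$, the usual expectation is the algebra map
$$E \colon D\mathbb{R} \to \mathbb{R}, \qquad \sum_i p_i\,\delta_{x_i} \mapsto \sum_i p_i\, x_i,$$
and the algebra laws say exactly that $E(\delta_x) = x$ and that $E \circ \mu_{\mathbb{R}} = E \circ DE$, i.e. averaging a mixture of distributions agrees with mixing their averages.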

Spencer Breiner (May 05 2022 at 21:20):

I imagine this should involve "representable" Markov categories and distribution objects to dig deterministic values out of Markov processes, but I didn't see any systematic discussion of expectation and variance in that paper.

John Baez (May 05 2022 at 23:41):

I can't help, but I like the use of the term "dig" here.

Nathaniel Virgo (May 06 2022 at 01:02):

Spencer Breiner said:

I suppose that in many cases an expectation is an algebra for the probability monad, but I don't know what that looks like in general Markov categories.

A long while ago I thought a bit about generalising algebras of probability monads to general Markov categories. It seems like you want the following thing:

Don't take that too seriously, though; it's sort of just a guess, and I might have misremembered it. You probably have to consider the Markov category as enriched in measurable spaces for it to work properly.

One thing to note is that algebras of probability monads are not necessarily expectations, as there are algebras you can construct from partial orders that are not of that form. (See Fritz' paper on convex spaces, for example.) So if you really wanted to characterise expectations you'd have to add some extra axioms to get rid of those.

Tobias Fritz (May 06 2022 at 06:43):

Hi! To understand your problem a little better @Spencer Breiner, on what sort of object does your state live? Is it some version of the real line? This is the most commonly assumed case in classical probability, where distributions on the real line and their expectations (whenever they exist) play a central role.

So far, we haven't thought much about how to do this in Markov categories in a good way. Why not? Mainly because real-valued random variables and distributions on $\mathbb{R}$ are surprisingly redundant in the foundations of probability. For example, we do have some preliminary work on the law of large numbers (see this talk), and the real numbers are not involved in any way! Rather, from the categorical perspective it is more natural to work with objects of distributions, and to formulate the law as something like this: when you make repeated draws from a given distribution $p$, then the resulting empirical distribution converges to $p$. The Glivenko-Cantelli theorem is one way to make this precise. So basically, we use free algebras instead of general ones, and that goes a surprisingly long way. But I think you know this already.
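(To spell out the classical statement behind this, with $X_1, X_2, \dots$ i.i.d. draws from a distribution $p$ on $\mathbb{R}$ with CDF $F$: the empirical distribution and empirical CDF are
$$\hat p_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{X_i}, \qquad \hat F_n(x) = \hat p_n\bigl((-\infty, x]\bigr),$$
and the Glivenko-Cantelli theorem says that $\sup_{x} |\hat F_n(x) - F(x)| \to 0$ almost surely. No expectation of $p$ appears anywhere in that statement.)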

Tobias Fritz (May 06 2022 at 06:47):

That being said, @Nathaniel Virgo's tentative definition of algebra looks intriguing! If something along those lines works, then it's surely an instance of some general theorem on how to encode an algebra of a monad (say with monic unit) as a structure on the Kleisli category.

Tobias Fritz (May 06 2022 at 06:53):

Nathaniel Virgo said:

One thing to note is that algebras of probability monads are not necessarily expectations, as there are algebras you can construct from partial orders that are not of that form. (See Fritz' paper on convex spaces, for example.) So if you really wanted to characterise expectations you'd have to add some extra axioms to get rid of those.

That point I disagree with. If all you want to do is take expectations of finitely supported probability measures, then your monad is the convex combinations monad on Set, and you can take expectations in all of its algebras, including the weird ones! "Expectation" and "algebra map" are the same concept, and I don't see any need for further caveats.

Perhaps surprisingly, some of the probability monads in measure-theoretic probability do not even have any weird algebras at all. For example, the algebras of the monad of Radon probability measures on compact Hausdorff spaces are exactly the compact convex sets in vector spaces (more precisely, in locally convex topological vector spaces). This is a result of Swirszcz.
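To make the last point concrete: for a compact convex subset $K$ of a locally convex space, the algebra map in Swirszcz's characterisation sends a Radon probability measure $p$ on $K$ to its barycenter, the unique point $\beta(p) \in K$ with
$$f\bigl(\beta(p)\bigr) = \int_K f \, dp \quad \text{for every continuous linear functional } f,$$
so the algebra map is literally "take the expectation".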

Tobias Fritz (May 06 2022 at 06:59):

One more thing: although our current tentative treatment of the law of large numbers lives at the level of free algebras, I suspect that it automatically implies the analogous statement for all algebras. In case it could help, we could share our unfinished manuscript over email.

Paolo Perrone (May 06 2022 at 08:13):

Excellent answers here. I'd just like to point out that in the paper on representable Markov categories already mentioned, Section 4, there is a discussion involving algebras as "expectation", especially in the context of conditional expectation.

Moritz Schauer (May 06 2022 at 09:05):

One perhaps useful functional-analytic characterisation of expectations is that they are positive linear functionals on test function spaces.

Nathaniel Virgo (May 06 2022 at 09:11):

Tobias Fritz said:

Nathaniel Virgo said:

One thing to note is that algebras of probability monads are not necessarily expectations, as there are algebras you can construct from partial orders that are not of that form. (See Fritz' paper on convex spaces, for example.) So if you really wanted to characterise expectations you'd have to add some extra axioms to get rid of those.

That point I disagree with. If all you want to do is take expectations of finitely supported probability measures, then your monad is the convex combinations monad on Set, and you can take expectations in all of its algebras, including the weird ones! "Expectation" and "algebra map" are the same concept, and I don't see any need for further caveats.

This is probably a terminology thing, but here's what I meant in more detail. Consider the finitely supported distribution monad, and consider the object $\mathbb{R}$. There are many algebras we can put on this object. Many of them are expectations, in that they have the form $f^{-1}\left(\sum_i p_i f(x_i)\right)$ for some invertible function $f\colon \mathbb{R}\to\mathbb{R}$. But there are some that aren't, such as the one that maps $p\in P\mathbb{R}$ to the maximum of the support of $p$. This seems like a limiting case of an expectation, while not being an expectation itself, at least not in the sense I'd usually use that word.

More generally, algebras of the distribution monad are convex spaces, and some convex spaces behave like convex subsets of $\mathbb{R}^n$, while others don't. Regardless of whether we use the word expectation for the ones that have this kind of property, it would be interesting to know what extra axioms are needed to rule out the "weird" ones that lack it, for general probability monads rather than just the distribution monad.

Of course a lot of results that apply to what I'm calling expectations will also apply to more general algebras of probability monads. But one place it might matter is in defining exponential families, because these are closely related to the notion of expectation. My hunch is that the notion of expectation will appear in their definition, and if you generalise it to a more general algebra you'll get a more general class of families of distributions. But I never did quite see how to do that.
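To make the contrast concrete, here is a minimal Python sketch (all names hypothetical, not from any library) of the finitely supported distribution monad together with the two algebra structures on the reals discussed above: the usual expectation and the max-of-support map. Both satisfy the unit and multiplication laws of an algebra; only the first is an expectation in the sense I mean.

```python
from collections import defaultdict

# A finitely supported distribution on the reals: a dict {outcome: probability}
# with strictly positive probabilities summing to 1.

def dirac(x):
    """Unit of the distribution monad D: the point mass at x."""
    return {x: 1.0}

def flatten(dist_of_dists):
    """Multiplication of D: mix a finitely supported distribution over
    distributions, given as a list of (inner_distribution, weight) pairs."""
    out = defaultdict(float)
    for inner, weight in dist_of_dists:
        for x, prob in inner.items():
            out[x] += weight * prob
    return dict(out)

# Two D-algebra structures on the object R:

def expectation(dist):
    """The usual expectation: sum_i p_i * x_i."""
    return sum(prob * x for x, prob in dist.items())

def max_of_support(dist):
    """A 'weird' algebra map: the maximum of the support.
    It satisfies the algebra laws but is not an expectation."""
    return max(x for x, prob in dist.items() if prob > 0)

# Checking the algebra laws on a small example (chosen with distinct images,
# so the pushforward below can be written as a plain dict):
p = {0.0: 0.5, 1.0: 0.5}
q = {2.0: 1.0}
mixture = [(p, 0.3), (q, 0.7)]  # an element of DD(R)

for alg in (expectation, max_of_support):
    assert alg(dirac(5.0)) == 5.0                               # unit law
    pushforward = {alg(p): 0.3, alg(q): 0.7}                    # D(alg) applied to the mixture
    assert abs(alg(flatten(mixture)) - alg(pushforward)) < 1e-9 # multiplication law
```

The max-of-support map passes the multiplication law because the support of a mixture (with strictly positive weights) is the union of the supports of its components.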

Tobias Fritz (May 06 2022 at 10:16):

I guess my point is that I don't see why one wouldn't want to call the algebra map of a "weird" algebra an expectation. Mathematical terminology is often chosen in such a way that the resulting concepts have counterintuitive instances. A good guiding principle in choosing terminology is to ensure that the theorems one wants can be proven, in reasonable generality and without irrelevant assumptions. With this in mind, it could be helpful to see which interesting theorem (or construction) fails for the weird ones.

If the construction of exponential families works well for $\mathbb{R}$ with its usual algebra structure, but gives something else for the weird (or tropical) version of $\mathbb{R}$, then maybe this could even be a feature rather than a bug?

Tobias Fritz (May 06 2022 at 10:28):

Moritz Schauer said:

One perhaps useful functional analytic characterisation of expectations is that they are positive linear functionals on test function spaces,

Right. In that kind of approach, one usually considers only expectations of real-valued (or complex-valued) random variables. But there are much more general cases that are of interest! For example, one may want to consider a random measure (= a probability measure on a space of probability measures) and form its expectation, which then again is a probability measure. Of course one can reduce this to the real-valued case; but a nice feature of categorical probability is that one acquires a more abstract picture in which one doesn't need to do such a reduction.
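To give the formula for that example (for, say, the Giry monad $P$): if $\rho \in P(PX)$ is the law of the random measure, then its expectation is the probability measure
$$\bar\rho(A) = \int_{PX} \pi(A)\, \rho(d\pi),$$
which is precisely the monad multiplication $\mu_X \colon PPX \to PX$ applied to $\rho$. So this higher-order expectation is already part of the monad structure, with no detour through real-valued integrands needed.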

Moritz Schauer (May 06 2022 at 14:49):

Not only real random variables, but functionals of random variables. If you have an initial morphism $\psi$ and an "effect" $h$, then $\psi h$ = "$\int h \, d\psi$" = "$\mathbb{E}\, h(X)$ with $X \sim \psi$".

Tobias Fritz (May 06 2022 at 15:12):

Right. My terminology was probably a bit confusing; what I meant by "random variable" is exactly your $h$. I meant to indicate that although the measures-as-functionals idea is nice, from the perspective of Markov categories it's not part of the fundamental structure of probability but more of an irrelevant implementation detail.

Moritz Schauer (May 09 2022 at 08:22):

Ah, makes perfect sense