Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: event: Categorical Probability and Statistics 2020 workshop

Topic: Jun 8: Alex Simpson's talk

Paolo Perrone (Jun 04 2020 at 19:17):

Hey all,
This is the discussion thread of Alex Simpson's talk, "Synthetic probability theory".
The talk, besides being on Zoom, is livestreamed here: https://youtu.be/wsSpaIWqszQ

Paolo Perrone (Jun 04 2020 at 19:22):

Date and time: Monday, 8 Jun, 12h UTC.

Paolo Perrone (Jun 08 2020 at 11:40):

Hello all! We start in 20 minutes.

Arthur Parzygnat (Jun 08 2020 at 12:19):

This is nice, I initially heard an earlier version of some of these ideas from Tao’s blog. This functor viewpoint seems to encode those ideas quite nicely. I'll post a link when I find it... Update: Okay the link is https://terrytao.wordpress.com/2010/01/01/254a-notes-0-a-review-of-probability-theory/ and the notion is that of an extension of a random variable. Alex has a paper developing this in the language of sheaves here.

Paolo Perrone (Jun 08 2020 at 12:27):

Yep, this is very very nice!

Robert Furber (Jun 08 2020 at 12:51):

:clap: (sorry can't zoom)

Bas Spitters (Jun 08 2020 at 12:59):

@Alex Simpson Just to repeat it here. What are your plans for a computable development?

Manuel Bärenz (Jun 08 2020 at 13:01):

@Alex Simpson

Maybe a beginner question: By having only sets, do you somehow lose the ability to treat random variables with topology? E.g. real valued ones?

Also, what about probabilities that can't be generated from a random coin toss?

Dario Stein (Jun 08 2020 at 13:04):

@Alex Simpson Thanks for the great talk -- are there any notes/papers published on your developments?

To flesh out my comment about random sets. I've shown that in quasi-Borel spaces, the (2^R)-valued random variables {X} and {} have the same law, when X is sampled from a continuous distribution on R. In particular there can be no measurable way of checking if a Borel set is empty. I came to consider this a feature instead of a bug, as it has a nice interpretation in terms of privacy/information hiding of the value of X. I wondered what happens in your theory in this example.

Alex Simpson (Jun 08 2020 at 13:58):

Bas Spitters said:

Alex Simpson Just to repeat it here. What are your plans for a computable development?

Thank you for the question, Bas. I very much like the general approach to dealing with "probability measures" constructively/computably that you talked about yesterday. I wonder if this can be combined with an axiomatic notion of random variable, similar to that in my talk. So probability measures on the powerset PX would be replaced by continuous probability valuations on X -> S (where S is your Sierpinski object, I don't remember what notation you used).

(I would prefer to take probability valuations, rather than subprobability valuations, as basic, and to derive subprobability valuations as probability valuations on a "lifted" space.)

Tobias Fritz (Jun 08 2020 at 14:00):

Alex's slides are now here (and many other ones are also downloadable from the schedule).

Alex Simpson (Jun 08 2020 at 14:09):

Manuel Bärenz said:

Alex Simpson

Maybe a beginner question: By having only sets, do you somehow lose the ability to treat random variables with topology? E.g. real valued ones?

Thanks for the questions, Manuel. Regarding topology, one indeed has a notion of random variable in a set irrespective of whether the set comes with a topology. However, in the case that you do have a topological space, it still makes sense to ask questions about how the probability laws of random variables relate to the topology. I actually use this possibility in what I call the "near Borel axiom": for standard Borel spaces, probability laws on the powerset end up being determined by their values on Borel sets, which are (in one way of looking at things) derived from a topology. In particular, I think the treatment of real-valued random variables goes through perfectly smoothly.

Tobias Fritz (Jun 08 2020 at 14:10):

This has been super interesting! I'd have two further questions if you don't mind.

Do you know how much of the theory that you presented can still be developed without excluded middle or dependent choice, with an appropriate choice of definitions? (E.g. concerning which version of the reals to use for measures to take values in.)
You briefly mentioned that the random variables functor does not preserve arbitrary products. Your slides seem to be saying that this is because there are distinguishable processes which are modifications of each other. Does this follow from your axioms alone? What would be an example? (I imagine that this is standard in conventional stochastic process theory and I just don't know of it. It seems related to our discussion on extension theorems.)

Alex Simpson (Jun 08 2020 at 14:19):

Manuel Bärenz said:

Alex Simpson

Also, what about probabilities that can't be generated from a random coin toss?

This is a real limitation of my approach. I rely on the "universality of $\lambda$" axiom for many things. It is well known that all Borel probability measures on standard Borel spaces are obtained as pushforwards of $\lambda$ along measurable functions. So all such measures are included, which is a lot. However, I have no means, for example, of dealing with something like Haar probability measures on non-separable compact groups. (If one does want to include such measures, the whole framework probably needs re-thinking, including the decision to reject the axiom of choice.)
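
A minimal Python sketch of the pushforward idea mentioned above, assuming $\lambda$ is the uniform measure on $[0,1]$ (the thread does not spell this out). The names `exponential_quantile` and `bernoulli_from_uniform` are hypothetical illustrations, not anything from the talk; each is a measurable map whose pushforward of the uniform law is the named distribution.

```python
import math
import random


def exponential_quantile(u, rate=1.0):
    """Inverse CDF of Exp(rate): pushes Uniform[0,1) forward to Exp(rate)."""
    # log1p(-u) = log(1 - u) is well defined because u < 1.
    return -math.log1p(-u) / rate


def bernoulli_from_uniform(u, p):
    """Map [0,1) -> {0,1} whose pushforward of the uniform law is Bernoulli(p)."""
    return 1 if u < p else 0


# Draw from the uniform law, then transport along the measurable map.
rng = random.Random(0)
samples = [exponential_quantile(rng.random(), rate=2.0) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)  # should be close to 1 / rate
```

The same pattern (sample a uniform value, then apply a measurable function) covers any Borel probability measure on a standard Borel space, which is the content of the well-known fact quoted above; it says nothing about measures outside that class, such as the Haar example.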

Alex Simpson (Jun 08 2020 at 14:37):

Arthur Parzygnat said:

This is nice, I initially heard an earlier version of some of these ideas from Tao’s blog. This functor viewpoint seems to encode those ideas quite nicely. I'll post a link when I find it... Update: Okay the link is https://terrytao.wordpress.com/2010/01/01/254a-notes-0-a-review-of-probability-theory/ and the notion is that of an extension of a random variable. Alex has a paper developing this in the language of sheaves here.

Hi Arthur, thanks for the excellent chairing, and for making the connection to Tao's blog. Well connected! Indeed, this whole development has been greatly influenced by that blog post. I began by refining Tao's "preservation under extension", which is a presheaf property, to "preservation under extension and restriction", which is a sheaf property. This produces a boolean topos of what I call "probability sheaves". The topos model of the axiomatisation I gave today is obtained by constructing the Grothendieck topos of probability sheaves relative to a topos model of "all subsets of $\mathbb{R}$ are measurable".

Unfortunately, I still haven't finished writing any of this up. For those who would like more details about "probability sheaves", a talk I gave on them at the IHES workshop on topos theory in 2015 is on YouTube here.

Alex Simpson (Jun 08 2020 at 14:54):

Tobias Fritz said:

This has been super interesting! I'd have two further questions if you don't mind.

Do you know how much of the theory that you presented can still be developed without excluded middle or dependent choice, with an appropriate choice of definitions? (E.g. concerning which version of the reals to use for measures to take values in.)

Thanks Tobias! I make use of excluded middle and dependent choice everywhere. Excluded middle is needed to talk about (ordinary Dedekind)-real-valued probability measures on powersets, as well as for lots of the reasoning. Dependent choice plays a fundamental role in constructing families of independent and also interdependent random variables (in addition to the iid sequences I considered in the talk, it is e.g. used in constructing Brownian motion).

Nevertheless, I do think of constructive/computable probability theory as very interesting, and for that one of course needs to drop excluded middle at least. As in my reply to @Bas Spitters, I very much like his approach to choice-free constructive probability theory. I think it would be interesting to investigate having an abstract account of random variables in that setting.

I shall postpone your second question (along with that of @Dario Stein, which is related) until after the discussion session.

Paolo Perrone (Jun 08 2020 at 18:45):

Hi all! Here's the video.
https://youtu.be/XtsBsLM9ofk

Alex Simpson (Jun 08 2020 at 19:32):

Dario Stein said:

Alex Simpson Thanks for the great talk -- are there any notes/papers published on your developments?

To flesh out my comment about random sets. I've shown that in quasi-Borel spaces, the (2^R)-valued random variables {X} and {} have the same law, when X is sampled from a continuous distribution on R. In particular there can be no measurable way of checking if a Borel set is empty. I came to consider this a feature instead of a bug, as it has a nice interpretation in terms of privacy/information hiding of the value of X. I wondered what happens in your theory in this example.

Thank you, Dario, for this interesting example. It is indeed possible to define these two "random sets" as random variables in my setting, that is, as elements of $RV(2^\mathbb{R})$. These random variables are different. More importantly, their probability laws are different, for the reason one would expect. The first has probability $0$ of being the empty set. The second has probability $1$. There is no issue with this distinction in my setting, and measurability concerns cannot be a source of trouble. For example, it is not the case in my theory that all functions between standard Borel spaces are Borel measurable.

Presumably in qBS, you consider the above random sets as random elements in $P(2^\mathbb{R})$. Is the issue that arises in qBS something like the following? In $P(2^\mathbb{R})$, as defined in the qBS way, is it the case that the underlying $\sigma$-algebra, using which random elements are compared for equivalence, contains only the empty set and the whole set? (It seems to me that something like this might occur due to the fact that Borel sets are not closed under projection. But I haven't thought it through.) I'm guessing a bit here, and I'd be very interested to hear from you what the actual situation is.

Regarding written material, I am afraid I don't have anything available yet. This is quite a priority now.

Dario Stein (Jun 08 2020 at 20:22):

You're right, Borel sets not being closed under projection plays a role -- the emptiness check is not a morphism in qbs. The $\sigma$-algebra consists precisely of the qbs morphisms $2^\mathbb{R} \to 2$, of which there are many useful examples. You can test for membership, take s-finite measures of your set, etc. The precise condition is known as "Borel on Borel" in the literature. But we managed to show that such morphisms can't distinguish the empty set from almost all singletons.

Alex Simpson (Jun 08 2020 at 20:25):

Tobias Fritz said:

You briefly mentioned that the random variables functor does not preserve arbitrary products. Your slides seem to be saying that this is because there are distinguishable processes which are modifications of each other. Does this follow from your axioms alone? What would be an example? (I imagine that this is standard in conventional stochastic process theory and I just don't know of it. It seems related to our discussion on extension theorems.)

Yes, this does follow from my axioms. I am going to use a version of the example serendipitously provided by @Dario Stein in his question. In a conventional stochastic process formulation this example consists of two processes: $\{X\}$ is the process (using Kronecker delta)

$\omega,y \mapsto \delta_\omega(y) \colon [0,1] \times \mathbb{R} \to \{0,1\}$

where $[0,1]$ is the sample space with uniform distribution. And $\{\}$ is the process

$\omega,y \mapsto 0 \colon [0,1] \times \mathbb{R} \to \{0,1\}$

These two processes are modifications of each other: for every $y$, the random variables $\omega \mapsto \{X\}(\omega, y)$ and $\omega \mapsto \{\}(\omega,y)$ are almost surely equal (the latter is the constant map to $0$). But they are not indistinguishable: the probability, for a randomly selected $\omega$, that the functions $y \mapsto \{X\}(\omega, y)$ and $y \mapsto \{\}(\omega,y)$ are equal is not $1$; in fact it is $0$.

These processes are easily definable in my setting from a uniform random variable $X \in RV[0,1]$. Namely, $\{X\}$ is obtained by transporting $X$ along $(x \mapsto \lambda y.\, \delta_x(y)) : [0,1] \to \{0,1\}^\mathbb{R}$. And $\{\} \in RV( \{0,1\}^\mathbb{R})$ is simply the Dirac-delta for $\lambda y.\,0$.

Due to the above observations about the processes being distinguishable modifications of each other, it holds that $\{X\} \neq \{\} \in RV( \{0,1\}^\mathbb{R})$, but the natural map from $RV( \{0,1\}^\mathbb{R})$ to $RV( \{0,1\})^\mathbb{R}$ identifies them.
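
The distinction between "modification" and "indistinguishable" can be illustrated with a small Python sketch. This is only a finite simulation under stated assumptions: the sample space $[0,1]$ is approximated by floating-point draws, and the function names (`singleton_process`, `empty_process`, `agree_at`, `paths_differ`) are hypothetical, introduced here for illustration.

```python
import random


def singleton_process(omega, y):
    """The process {X}: Kronecker delta, equal to 1 exactly when y == omega."""
    return 1 if y == omega else 0


def empty_process(omega, y):
    """The process {}: constantly 0."""
    return 0


def agree_at(omegas, y):
    """Fraction of sample points at which the two processes agree at index y."""
    hits = sum(singleton_process(w, y) == empty_process(w, y) for w in omegas)
    return hits / len(omegas)


def paths_differ(omega):
    """The two sample paths at omega always differ: they disagree at y = omega."""
    return singleton_process(omega, omega) != empty_process(omega, omega)


rng = random.Random(0)
omegas = [rng.random() for _ in range(10_000)]

# Modification: for each fixed index y, agreement holds for (almost) every omega.
for y in (0.25, 0.5, 0.75):
    print(y, agree_at(omegas, y))

# Not indistinguishable: for every omega, the whole paths differ somewhere.
print(all(paths_differ(w) for w in omegas))
```

Fixing $y$ first and then sampling $\omega$ corresponds to comparing the images in $RV(\{0,1\})^\mathbb{R}$, while comparing whole paths corresponds to comparing elements of $RV(\{0,1\}^\mathbb{R})$.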

Alex Simpson (Jun 08 2020 at 21:01):

Dario Stein said:

You're right, Borel sets not being closed under projection plays a role -- the emptiness check is not a morphism in qbs. The $\sigma$-algebra consists precisely of the qbs morphisms $2^\mathbb{R} \to 2$, of which there are many useful examples. You can test for membership, take s-finite measures of your set, etc. The precise condition is known as "Borel on Borel" in the literature. But we managed to show that such morphisms can't distinguish the empty set from almost all singletons.

Thank you for the explanation, Dario! That is more subtle than I thought. I agree with you it sounds like a very interesting feature of qBS rather than a bug. On the other hand, distinguishing the two examples in my setting is also not a bug, as it corresponds to what happens if they are considered as stochastic processes, as in my reply to @Tobias Fritz .

Tobias Fritz (Jun 08 2020 at 21:26):

Thanks, Alex! That's very useful input and I'll have to let that sink in for a while (even if it's not specific to what you're doing). Independently of that, I'm also very much looking forward to reading your paper when it's ready.

Oscar Cunningham (Jun 09 2020 at 14:23):

@Alex Simpson I just watched the video of the talk and found it very interesting. Thank you!
One thing I noticed was that you could phrase what you call 'functional dependence' in terms of the category of elements of the functor $RV$. This category has all random variables as its objects, and a unique morphism from $X$ to $Y$ if and only if $Y$ is functionally dependent on $X$ (so it is in fact a poset). Another interesting thing about this category is that products give joint distributions.

Alex Simpson (Jun 09 2020 at 15:07):

Oscar Cunningham said:

Alex Simpson I just watched the video of the talk and found it very interesting. Thank you!
One thing I noticed was that you could phrase what you call 'functional dependence' in terms of the category of elements of the functor $RV$. This category has all random variables as its objects, and a unique morphism from $X$ to $Y$ if and only if $Y$ is functionally dependent on $X$ (so it is in fact a poset). Another interesting thing about this category is that products give joint distributions.

Thank you for your comments, Oscar. I hadn't thought about the category of elements in this context before. I like the observation that joint distributions are products. A small correction: the uniqueness statement is not true in general. Given $X \in RV(A)$ and $Y \in RV(B)$ there may be many functions $f: A \to B$ for which $f(X) = Y$. Uniqueness holds in the sense that any two such functions will be $\mathbf{P}_X$-almost-everywhere equal. But for functions $A \to B$, almost-everywhere equality is not the same as equality (they are functions and the notion of equality for them is predetermined by the background set/type theory).
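
A toy Python sketch of this non-uniqueness, under a loud assumption: random variables are modelled classically as functions on a finite sample space (a fair coin), which is not the axiomatic $RV$ functor from the talk. The names `sample_space`, `transport`, `f` and `g` are hypothetical.

```python
# Assumed toy model: a fair coin as the sample space.
sample_space = ["H", "T"]

# X takes values in A = {0, 1, 2}, but never hits the value 2.
X = {"H": 0, "T": 1}

# Two functions A -> B with B = {"a", "b"}; they differ only at the point 2,
# which X reaches with probability 0.
f = {0: "a", 1: "b", 2: "a"}
g = {0: "a", 1: "b", 2: "b"}


def transport(h, rv):
    """The random variable h(rv), computed pointwise on the sample space."""
    return {w: h[rv[w]] for w in sample_space}


Y = transport(f, X)

# Both f and g witness that Y is functionally dependent on X ...
print(transport(g, X) == Y)
# ... yet f and g are different functions A -> B.
print(f == g)
```

Here `f` and `g` are equal almost everywhere with respect to the law of `X` but are not equal as functions, so the morphism from `X` to `Y` in the category of elements is not unique.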