You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hey all,
This is the discussion thread of my talk, "Probability monads and stochastic dominance".
The talk, besides being on Zoom, is livestreamed here: https://youtu.be/kKDMCDUaxxE
Date and time: Saturday, 6 Jun, 15h UTC.
Hi! We start in 30 minutes.
Great talk.
Can you say something about the bar construction mentioned in the abstract?
Do these ideas have anything to say about deciding between two investments p, q with nonzero probability of both p<=q and p>q, on the basis of expected return and risk?
Yes, great talk!
Is there a formal formulation for this process of "moving the mass"? Maybe pullback along projection and then pushforward?
I am wondering what is the main reason for considering order isomorphisms rather than homomorphisms for the complete metric space case? That is, why do we exclude coarse-grainings (of the order information)? Is (one) reason that we would be mixing up the first-order and second-order dominance (since the latter is related to coarse-grainings)?
A more exotic question: Is there a chance to have a unified treatment (which is still sufficiently expressive) of both first-order and second-order dominance at some level of abstraction? There are parallels between them, but also some important differences.
Joscha Diehl said:
Great talk.
Can you say something about the bar construction mentioned in the abstract?
@Joscha Diehl Thanks! Yep. The two maps which give the equivalent characterization of second-order stochastic dominance are exactly the source and target maps of 1-simplices in the bar construction. This turns out to be quite structural!
There's quite a bit to talk about - if you want I can expand. For now, two references are Chapter 4 of my thesis, and my MFPS talk here: https://youtu.be/28EASeG1RBA (paper here: https://arxiv.org/abs/1810.06037)
By the way, here's the link to my thesis: http://paoloperrone.org/phdthesis.pdf
Joshua Meyers said:
Do these ideas have anything to say about deciding between two investments p, q with nonzero probability of both p<=q and p>q, on the basis of expected return and risk?
If I understand you correctly, yes. So one can prove that if we are working over algebras of the probability monad, for example the real numbers, then if p is lower than q in the stochastic order, p has lower expectation value than q - this expectation map is even strictly monotone.
For what concerns risk, the second-order stochastic dominance is precisely the "riskiness". One can also use the two orders together, and that's sometimes also called second-order stochastic dominance. I can expand on this, if you want. In any case if you want a reference, this is worked out in my thesis, the link is above.
Peter Arndt said:
Yes, great talk!
Is there a formal formulation for this process of "moving the mass"? Maybe pullback along projection and then pushforward?
Yes, but in general it's a random transport map, not a deterministic one. In the sense that the mass over a certain point may be split, and sent to two different points.
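A minimal sketch of what such a random transport looks like on a finite set of points, assuming distributions are stored as dictionaries of exact fractions. The names `pushforward` and `kernel` and the numbers are illustrative assumptions, not taken from the talk.

```python
from fractions import Fraction as F

def pushforward(p, kernel):
    """Push the distribution p forward along a stochastic map.

    p      : dict point -> probability
    kernel : dict point -> dict target -> probability (each row sums to 1)
    """
    q = {}
    for x, mass in p.items():
        for y, share in kernel[x].items():
            q[y] = q.get(y, 0) + mass * share
    return q

# Hypothetical starting distribution: half the mass on 0, half on 1.
p = {0: F(1, 2), 1: F(1, 2)}

# The mass over 0 is split: half stays, half moves up to 1.
# The mass over 1 moves up to 2 entirely. Every move goes upward.
kernel = {0: {0: F(1, 2), 1: F(1, 2)}, 1: {2: F(1)}}

q = pushforward(p, kernel)
```

A deterministic map would be the special case where every row of `kernel` puts all its mass on a single target.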
Tomáš Gonda said:
I am wondering what is the main reason for considering order isomorphisms rather than homomorphisms for the complete metric space case? That is, why do we exclude coarse-grainings (of the order information)? Is (one) reason that we would be mixing up the first-order and second-order dominance (since the latter is related to coarse-grainings)?
Oh, we are not using only order isomorphisms, but homomorphisms. We can coarse-grain all we want. Only the unit of the monad (the Dirac delta) happens to be an order-embedding, everything else is not. Sorry if that wasn't clear!
Tomáš Gonda said:
A more exotic question: Is there a chance to have a unified treatment (which is still sufficiently expressive) of both first-order and second-order dominance at some level of abstraction? There are parallels between them, but also some important differences.
Yep, there is, unfortunately I didn't have time to talk about that. It even satisfies a very nice universal property, it's a "lax codescent object". A way to obtain this composite order is to take the coinserter of the two maps. I can expand on this, if you want.
Paolo Perrone said:
Tomáš Gonda said:
I am wondering what is the main reason for considering order isomorphisms rather than homomorphisms for the complete metric space case? That is, why do we exclude coarse-grainings (of the order information)? Is (one) reason that we would be mixing up the first-order and second-order dominance (since the latter is related to coarse-grainings)?
Oh, we are not using only order isomorphisms, but homomorphisms. We can coarse-grain all we want. Only the unit of the monad (the Dirac delta) happens to be an order-embedding, everything else is not. Sorry if that wasn't clear!
hm, ok, so what was the extra property of morphisms in COMet besides being order-preserving and 1-Lipschitz?
Paolo Perrone said:
Tomáš Gonda said:
A more exotic question: Is there a chance to have a unified treatment (which is still sufficiently expressive) of both first-order and second-order dominance at some level of abstraction? There are parallels between them, but also some important differences.
Yep, there is, unfortunately I didn't have time to talk about that. It even satisfies a very nice universal property, it's a "lax codescent object". A way to obtain this composite order is to take the coinserter of the two maps. I can expand on this, if you want.
That sounds intriguing (although probably slightly different to what I actually had in mind). Does it give you the "combined" order (i.e. the intersection of the two order relations)? I would love to know more about that, is it in some of your articles?
Tomáš Gonda said:
Paolo Perrone said:
Tomáš Gonda said:
I am wondering what is the main reason for considering order isomorphisms rather than homomorphisms for the complete metric space case? That is, why do we exclude coarse-grainings (of the order information)? Is (one) reason that we would be mixing up the first-order and second-order dominance (since the latter is related to coarse-grainings)?
Oh, we are not using only order isomorphisms, but homomorphisms. We can coarse-grain all we want. Only the unit of the monad (the Dirac delta) happens to be an order-embedding, everything else is not. Sorry if that wasn't clear!
hm, ok, so what was the extra property of morphisms in COMet besides being order-preserving and 1-Lipschitz?
That's not on the morphisms, it's on the orders. It's definition 4.1.1 in this paper: https://arxiv.org/abs/1808.09898
It says that 1-Lipschitz monotone functions are "enough to tell the order", the order is never too "steep". Example 4.1.2 there gives a space which does not satisfy that property.
By the way, we called those spaces "L-ordered" in honor of Lawvere - I should have said that since he was in the audience!
Tomáš Gonda said:
Paolo Perrone said:
Tomáš Gonda said:
A more exotic question: Is there a chance to have a unified treatment (which is still sufficiently expressive) of both first-order and second-order dominance at some level of abstraction? There are parallels between them, but also some important differences.
Yep, there is, unfortunately I didn't have time to talk about that. It even satisfies a very nice universal property, it's a "lax codescent object". A way to obtain this composite order is to take the coinserter of the two maps. I can expand on this, if you want.
That sounds intriguing (although probably slightly different to what I actually had in mind). Does it give you the "combined" order (i.e. the intersection of the two order relations)? I would love to know more about that, is it in some of your articles?
Oh, I think I see what you meant in your original question. I'd love to find a uniform treatment - so far we weren't able to find it except pretty much for what I said during the talk. I suspect one can do this in Markov categories with conditionals though.
If you want to know more about the combined order, see Section 4.3 of my thesis (link above). Hopefully it will become a paper soon.
Paolo Perrone said:
That's not on the morphisms, it's on the orders. It's definition 4.1.1 in this paper: https://arxiv.org/abs/1808.09898
It says that 1-Lipschitz monotone functions are "enough to tell the order", the order is never too "steep". Example 4.1.2 there gives a space which does not satisfy that property.
oh, that makes a lot of sense, thanks! I must have had a brief attention glitch while you were explaining it :sleeping:
Paolo Perrone said:
Oh, I think I see what you meant in your original question. I'd love to find a uniform treatment - so far we weren't able to find it except pretty much for what I said during the talk. I suspect one can do this in Markov categories with conditionals though.
If you want to know more about the combined order, see Section 4.3 of my thesis (link above). Hopefully it will become a paper soon.
I guess one way to interpret what I'd like to have from my (biased) perspective is finding a class of resource theories that would have enough structure to allow one to prove classification theorems about their resource orderings and which would include, as special cases, the first-order and second-order stochastic dominance among other preorders.
Paolo Perrone said:
Joshua Meyers said:
Do these ideas have anything to say about deciding between two investments p, q with nonzero probability of both p<=q and p>q, on the basis of expected return and risk?
If I understand you correctly, yes. So one can prove that if we are working over algebras of the probability monad, for example the real numbers, then if p is lower than q in the stochastic order, p has lower expectation value than q - this expectation map is even strictly monotone.
For what concerns risk, the second-order stochastic dominance is precisely the "riskiness". One can also use the two orders together, and that's sometimes also called second-order stochastic dominance. I can expand on this, if you want. In any case if you want a reference, this is worked out in my thesis, the link is above.
Yes but if the events p<=q and p>q both have nonzero probability, then neither has first-order stochastic dominance over the other, as you have defined it. And second-order stochastic dominance doesn't even apply, as they are not about "the same data".
Yes p<=q in the stochastic order implies that E(p)<=E(q), but not conversely.
Update: I should have said that the events p<q and q<p both have nonzero probability, sorry for the confusion.
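To make the "but not conversely" concrete, here is a small sketch on the real line, assuming finitely supported distributions stored as dictionaries and the usual CDF criterion for first-order dominance. The helper names and the two example investments are hypothetical.

```python
from fractions import Fraction as F

def cdf(p, t):
    """P(X <= t) for a finitely supported distribution p."""
    return sum(m for x, m in p.items() if x <= t)

def dominated(p, q):
    """True if p <= q in the first-order stochastic order on the reals:
    the CDF of p lies on or above the CDF of q at every point."""
    points = sorted(set(p) | set(q))
    return all(cdf(p, t) >= cdf(q, t) for t in points)

def mean(p):
    return sum(x * m for x, m in p.items())

p = {1: F(1)}                  # a sure return of 1
q = {0: F(1, 2), 4: F(1, 2)}   # a risky return with mean 2

# E(p) <= E(q) holds, yet neither distribution dominates the other.
ordered_by_mean = mean(p) <= mean(q)
comparable = dominated(p, q) or dominated(q, p)
```

Here `ordered_by_mean` is true while `comparable` is false, so the expectation alone does not recover the stochastic order.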
Joshua Meyers said:
Paolo Perrone said:
Joshua Meyers said:
Do these ideas have anything to say about deciding between two investments p, q with nonzero probability of both p<=q and p>q, on the basis of expected return and risk?
If I understand you correctly, yes. So one can prove that if we are working over algebras of the probability monad, for example the real numbers, then if p is lower than q in the stochastic order, p has lower expectation value than q - this expectation map is even strictly monotone.
For what concerns risk, the second-order stochastic dominance is precisely the "riskiness". One can also use the two orders together, and that's sometimes also called second-order stochastic dominance. I can expand on this, if you want. In any case if you want a reference, this is worked out in my thesis, the link is above.
Yes but if the events p<=q and p>q both have nonzero probability, then neither has first-order stochastic dominance over the other, as you have defined it. And second-order stochastic dominance doesn't even apply, as they are not about "the same data".
Yes p<=q in the stochastic order implies that E(p)<=E(q), but not conversely.
Update: I should have said that the events p<q and q<p both have nonzero probability, sorry for the confusion.
Uhm, I think I don't understand. There are (at least) 2 orders that one can construct, first and second order stochastic dominance, and their composite order as well. What is the question exactly, possibly in other words?
Perhaps Joshua is talking about the pointwise order on random variables that Paolo spoke about at the beginning? Joshua, what do you mean by the probability of the event p<=q?
If p and q are just probability measures on an ordered space, then it makes sense to try and compare them with respect to first-order stochastic dominance, even if no joint distribution is given, meaning even if we cannot talk about the probability of "the event p<=q". So you should think of it like this: if someone offers you a bet with 50:50 odds and someone else offers you one with 80:20 odds against you, then you'd rather choose the 50:50 one, right? (Assuming that the stakes are the same.) That's because the 50:50 is higher up in first-order stochastic dominance. And to make this decision, there's no need to know how the two bets are correlated or whether they're independent! It's merely a relation involving pairs of distributions.
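The two-bets example can be checked mechanically. The sketch below assumes a finite ordered set and uses the characterization of the stochastic order by upper sets (the higher distribution gives at least as much mass to every upward-closed set). The encoding of the outcomes as `"lose"` and `"win"` is an illustrative assumption. Note that no joint distribution appears anywhere.

```python
from fractions import Fraction as F
from itertools import combinations

def upper_sets(points, leq):
    """All upward-closed subsets of a finite ordered set."""
    result = []
    for r in range(len(points) + 1):
        for subset in combinations(points, r):
            s = set(subset)
            if all(y in s for x in s for y in points if leq(x, y)):
                result.append(s)
    return result

def stochastically_below(p, q, points, leq):
    """p <= q iff q gives at least as much mass as p to every upper set."""
    return all(
        sum(p.get(x, 0) for x in u) <= sum(q.get(x, 0) for x in u)
        for u in upper_sets(points, leq)
    )

points = ["lose", "win"]
leq = lambda x, y: x == y or (x == "lose" and y == "win")

fair = {"lose": F(1, 2), "win": F(1, 2)}   # the 50:50 bet
bad = {"lose": F(4, 5), "win": F(1, 5)}    # the 80:20 bet against you

bad_below_fair = stochastically_below(bad, fair, points, leq)
fair_below_bad = stochastically_below(fair, bad, points, leq)
```

Enumerating all subsets is exponential in the number of points, so this is only meant for tiny examples.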
Hi all! Here's the video.
https://youtu.be/auIuhRjMokQ
By the way, somebody was asking whether the stochastic order can be equivalently obtained by comparing quantiles: the answer is, yes if X = R, while in the general case of a partial order one cannot define quantiles. (Thanks to Sharwin Rezagholi for pointing that out!)
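For X = R the quantile comparison can be sketched as follows, assuming finitely supported distributions with exact rational weights (floats could fail the `total >= u` checks through rounding). The function names are hypothetical.

```python
from fractions import Fraction as F

def quantile(p, u):
    """Smallest x with P(X <= x) >= u, for 0 < u <= 1."""
    total = 0
    for x in sorted(p):
        total += p[x]
        if total >= u:
            return x
    raise ValueError("u must lie in (0, 1]")

def levels(p):
    """Cumulative probabilities, i.e. the levels where the quantile function jumps."""
    out, total = [], 0
    for x in sorted(p):
        total += p[x]
        out.append(total)
    return out

def below_by_quantiles(p, q):
    """p <= q iff every quantile of p is at most the same quantile of q.

    Both quantile functions are constant between consecutive levels,
    so it is enough to test at the levels of p and q.
    """
    us = sorted(set(levels(p)) | set(levels(q)))
    return all(quantile(p, u) <= quantile(q, u) for u in us)

p = {0: F(1, 2), 1: F(1, 2)}
q = {1: F(1, 2), 2: F(1, 2)}
r = {-1: F(1, 2), 5: F(1, 2)}
```

With these numbers `p` sits below `q`, while `p` and `r` are incomparable because their quantile functions cross.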
I am thinking that p and q are random variables, perhaps p is the return on investing in bananas and q is the return on investing in oranges. Suppose that the sets {p<q} and {q<p} both have nonzero measure. Then by your definition, we cannot say that either p or q has first-order stochastic dominance over the other. (I am recalling that you define q to have stochastic dominance over p iff p<=q with probability 1.) Is there a way to still compare them in this case? (Tobias's example of the two uncorrelated bets is an example of this.)
Thank you very much for the effort to explain clearly and comprehensibly, @Paolo Perrone . As a non-expert I deeply appreciate that, and I really enjoyed your talk, though much of it is way over my head. At least I feel your work is not completely out of reach for me, and that makes me very happy!
Please take the following as a gentle suggestion and not as criticism. How about occasionally replacing examples using goods and gains by examples using, e.g., biological traits and fitness (these happen to be the constructs I'm mostly interested in :see_no_evil:)? Traits can be anything from a cell's production rate of a protein, to the kind of care given by a parent to its offspring, and to the ploidy of the genetic system of the organism (multiplicity of chromosomes) or the pinkness of unicorns. Fitness usually represents some sort of success in reproduction or persistence that, crucially, does not necessarily occur at the expense of other entities (I am aware this may be possible also in economic contexts). IMHO this setting provides much nicer stories. I apologise if this suggestion is inappropriate.
Joshua Meyers said:
I am thinking that p and q are random variables, perhaps p is the return on investing in bananas and q is the return on investing in oranges. Suppose that the sets {p<q} and {q<p} both have nonzero measure. Then by your definition, we cannot say that either p or q has first-order stochastic dominance over the other. (I am recalling that you define q to have stochastic dominance over p iff p<=q with probability 1.) Is there a way to still compare them in this case? (Tobias's example of the two uncorrelated bets is an example of this.)
I see. If I understand correctly, if p and q come from random variables, and their joint assigns nonzero probability both to p<q and to q<p, then they are not comparable as random variables. However, it could still be that there exists some other joint entirely supported on the order relation - in that case p and q are comparable as random variables (in the stochastic order). It turns out that in many cases (such as over the real line) the stochastic order is a partial order, so if there is a joint assigning probability 1 to p<=q then there is no joint assigning probability 1 to q<p. (A reference for the latter, for example, is this paper of Tobias, https://arxiv.org/abs/1810.06771.)
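Here is a sketch of the "some other joint" idea on the real line, assuming finitely supported distributions with exact rational weights that each sum to 1. It builds the coupling that pairs equal quantiles (a standard construction, swapped in here for illustration; the thread does not say which joint is meant) and compares it with the independent joint. All names and numbers are hypothetical.

```python
from fractions import Fraction as F

def monotone_coupling(p, q):
    """Joint distribution with marginals p and q that pairs equal quantiles
    (the 'north-west corner' rule on the sorted supports)."""
    xs, ys = sorted(p), sorted(q)
    i = j = 0
    left_p, left_q = p[xs[0]], q[ys[0]]
    joint = {}
    while True:
        m = min(left_p, left_q)
        if m > 0:
            joint[(xs[i], ys[j])] = joint.get((xs[i], ys[j]), 0) + m
        left_p -= m
        left_q -= m
        if left_p == 0:          # atom of p used up, move to the next one
            i += 1
            if i == len(xs):
                break
            left_p = p[xs[i]]
        if left_q == 0:          # atom of q used up, move to the next one
            j += 1
            if j == len(ys):
                break
            left_q = q[ys[j]]
    return joint

p = {0: F(1, 2), 2: F(1, 2)}
q = {1: F(1, 2), 3: F(1, 2)}

independent = {(x, y): p[x] * q[y] for x in p for y in q}
coupled = monotone_coupling(p, q)

# The independent joint gives positive mass to both x < y and x > y ...
mixed = any(x > y for (x, y) in independent) and any(x < y for (x, y) in independent)
# ... while the quantile coupling lives entirely on the order relation x <= y.
on_order = all(x <= y for (x, y) in coupled)
```

So the same pair of marginals can look incomparable under one joint and comparable under another, which is why the stochastic order is a statement about the distributions only.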
Christoph Thies said:
Thank you very much for the effort to explain clearly and comprehensibly, Paolo Perrone . As a non-expert I deeply appreciate that, and I really enjoyed your talk, though much of it is way over my head. At least I feel your work is not completely out of reach for me, and that makes me very happy!
Please take the following as a gentle suggestion and not as criticism. How about occasionally replacing examples using goods and gains by examples using, e.g., biological traits and fitness (these happen to be the constructs I'm mostly interested in :see_no_evil:)? Traits can be anything from a cell's production rate of a protein, to the kind of care given by a parent to its offspring, and to the ploidy of the genetic system of the organism (multiplicity of chromosomes) or the pinkness of unicorns. Fitness usually represents some sort of success in reproduction or persistence that, crucially, does not necessarily occur at the expense of other entities (I am aware this may be possible also in economic contexts). IMHO this setting provides much nicer stories. I apologise if this suggestion is inappropriate.
Thank you for the appreciation!
It would be nice to use examples from biology. Unfortunately I don't know enough about it though. Do you have a particular example in mind, that you'd like to explain to me?
By the way, some people asked which program I used as virtual blackboard. It's called Xournal, http://xournal.sourceforge.net/.
@Paolo Perrone You showed that conditional expectations on X can be phrased as partial evaluation/double distributions in the case where X is convex. Do you know how this could connect to other synthetic notions of conditioning (assuming our Markov category comes from a monad)? What if X is not convex, e.g. can we somehow express conditioning on boolean outputs?
Paolo Perrone said:
Christoph Thies said:
Thank you very much for the effort to explain clearly and comprehensibly, Paolo Perrone . As a non-expert I deeply appreciate that, and I really enjoyed your talk, though much of it is way over my head. At least I feel your work is not completely out of reach for me, and that makes me very happy!
Please take the following as a gentle suggestion and not as criticism. How about occasionally replacing examples using goods and gains by examples using, e.g., biological traits and fitness (these happen to be the constructs I'm mostly interested in :see_no_evil:)? Traits can be anything from a cell's production rate of a protein, to the kind of care given by a parent to its offspring, and to the ploidy of the genetic system of the organism (multiplicity of chromosomes) or the pinkness of unicorns. Fitness usually represents some sort of success in reproduction or persistence that, crucially, does not necessarily occur at the expense of other entities (I am aware this may be possible also in economic contexts). IMHO this setting provides much nicer stories. I apologise if this suggestion is inappropriate.
Thank you for the appreciation!
It would be nice to use examples from biology. Unfortunately I don't know enough about it though. Do you have a particular example in mind, that you'd like to explain to me?
Hm, I don't have a specific example. In fact, wrt your talk I think your examples were perfectly appropriate. It just occurred to me this might be a useful suggestion since biology instantiates these abstract matters in ways quite different from economy.
Personally, I am interested for example in biological scenarios in which taking the codomain of random variables that represent, say, traits to be the real numbers is not adequate so that classical statistics becomes sort of dodgy as a language for reasoning about them (and I am here in order to understand better what I mean by this and other things I don't quite understand).
Dario Stein said:
Paolo Perrone You showed that conditional expectations on X can be phrased as partial evaluation/double distributions in the case where X is convex. Do you know how this could connect to other synthetic notions of conditioning (assuming our Markov category comes from a monad)? What if X is not convex, e.g. can we somehow express conditioning on boolean outputs?
That's a very good question! We are currently working on it actually (Tobias, Eigil, Tomáš and I), we suspect that you can.
For non-convex things, when one conditions, one is usually doing one of these two things: either conditioning a function on X into a convex set (say, US states to political preferences), or working on the free convex space over X.
@Paolo Perrone I don't understand your example in the beginning. Why is the white curve better than the yellow?
(Following up from in-person conversation: the intuition is that the white curve is both "higher up", so that you are likely to gain more, and "more peaked", so you have a better idea of how much exactly you will gain - which can be useful for planning a strategy.)
Paolo Perrone said:
Dario Stein said:
Paolo Perrone You showed that conditional expectations on X can be phrased as partial evaluation/double distributions in the case where X is convex. Do you know how this could connect to other synthetic notions of conditioning (assuming our Markov category comes from a monad)? What if X is not convex, e.g. can we somehow express conditioning on boolean outputs?
That's a very good question! We are currently working on it actually (Tobias, Eigil, Tomáš and I), we suspect that you can.
For non-convex things, when one conditions, one is usually doing one of these two things: either conditioning a function on X into a convex set (say, US states to political preferences), or working on the free convex space over X.
Fantastic, very interested by that. Sam's and my Beta-Bernoulli monad encodes essentially the conjugate prior relationship of the two distributions and nothing more; I'd be keen to work out if we can formalize conditioning on the Bernoulli flips in a synthetic way (the main difficulty so far was the central position of if/coproducts in the language, which Markov categories still don't reflect well).
Dario Stein said:
Fantastic, very interested by that. Sam's and my Beta-Bernoulli monad encodes essentially the conjugate prior relationship of the two distributions and nothing more; I'd be keen to work out if we can formalize conditioning on the Bernoulli flips in a synthetic way (the main difficulty so far was the central position of if/coproducts in the language, which Markov categories still don't reflect well).
I'd like to understand what you mean in a little more detail, if you could elaborate or point me to a reference. For example, what exactly is the Beta-Bernoulli monad? Perhaps this paper of yours? And is what you're talking about related to Jacobs' work on conjugate priors [Ref: https://arxiv.org/abs/1707.00269]? From my understanding of his paper, he had a nice description of conjugate priors in the Markov category language.
Paolo Perrone said:
Joscha Diehl said:
Great talk.
Can you say something about the bar construction mentioned in the abstract?
Joscha Diehl Thanks! Yep. The two maps which give the equivalent characterization of second-order stochastic dominance are exactly the source and target maps of 1-simplices in the bar construction. This turns out to be quite structural!
There's quite a bit to talk about - if you want I can expand. For now, two references are Chapter 4 of my thesis, and my MFPS talk here: https://youtu.be/28EASeG1RBA (paper here: https://arxiv.org/abs/1810.06037)
Thanks! The paper should be a good entry point for me to get what's going on. I was thinking of the bar construction in algebra, that builds a coalgebra out of an algebra. I guess the categorial bar construction is an abstraction of that?
More conceptually: is the hope in understanding 'your' bar construction, to get to third-order dominance, etc?
Joscha Diehl said:
Paolo Perrone said:
Joscha Diehl said:
Great talk.
Can you say something about the bar construction mentioned in the abstract?
Joscha Diehl Thanks! Yep. The two maps which give the equivalent characterization of second-order stochastic dominance are exactly the source and target maps of 1-simplices in the bar construction. This turns out to be quite structural!
There's quite a bit to talk about - if you want I can expand. For now, two references are Chapter 4 of my thesis, and my MFPS talk here: https://youtu.be/28EASeG1RBA (paper here: https://arxiv.org/abs/1810.06037)
Thanks! The paper should be a good entry point for me to get what's going on. I was thinking of the bar construction in algebra, that builds a coalgebra out of an algebra. I guess the categorial bar construction is an abstraction of that?
More conceptually: is the hope in understanding 'your' bar construction, to get to third-order dominance, etc?
@Joscha Diehl I don't know if we get a coalgebra in this general case, it could be interesting! Does anybody know how this works?
In any case, here's roughly what happens. In the category of sets, the functor given by multiplying with a group (or monoid) G has a natural monad structure, with unit and multiplication given by those of G. It's called the action monad, or writer monad in computer science, https://ncatlab.org/nlab/show/action+monad
Algebras over this monad are G-sets, i.e. sets with a G-action. The very same holds in the category of Abelian groups if you take a ring R and the functor given by tensoring with the ring. Algebras in this case are R-modules.
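A small sketch of the action (writer) monad on sets, assuming the monoid is the integers under addition and an element of the functor applied to X is a pair (monoid element, x). The algebra `act`, translation of the integers by themselves, is a hypothetical example chosen only to check the algebra laws.

```python
def unit(x, identity=0):
    """Unit of the monad: pair x with the identity of the monoid."""
    return (identity, x)

def multiplication(nested, op=lambda m, n: m + n):
    """Multiplication: collapse (m, (n, x)) to (m * n, x) using the monoid operation."""
    m, (n, x) = nested
    return (op(m, n), x)

def act(pair):
    """A hypothetical algebra: the integers acting on themselves by translation."""
    m, x = pair
    return m + x

# Unit law of an algebra: acting with the identity does nothing.
unit_law = act(unit(7)) == 7

# Associativity law: multiply first then act, or act twice, same result.
nested = (2, (3, 7))
m, inner = nested
assoc_law = act(multiplication(nested)) == act((m, act(inner)))
```

The two checks are exactly the statement that `act` is a monoid action, which is what being an algebra over this monad amounts to.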
Now it turns out that in order to define the bar construction, one does not need a group or a ring, the monad structure is enough. Outside the category of Abelian groups one gets a simplicial object, which in the Abelian case is "the same" as a chain complex by the Dold-Kan correspondence.
If you do that for probability monads, this simplicial object is such that 0-simplices are probability measures over a convex space (such as R), and 1-simplices are (basically) conditional expectations. Two measures are connected by an "edge" if one can be obtained by "concentrating" the other one (as we did in the US states example).
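As a rough illustration of such an edge, assume finitely supported distributions on the rationals, and represent a "distribution of distributions" as a list of (weight, inner distribution) pairs. The two maps out of it are the monad multiplication (average the inner distributions) and the map that replaces each inner distribution by its expectation. The names `flatten` and `concentrate` and the numbers are illustrative assumptions.

```python
from fractions import Fraction as F

def mean(p):
    return sum(x * m for x, m in p.items())

def flatten(pp):
    """Monad multiplication: mix the inner distributions with the outer weights."""
    out = {}
    for weight, inner in pp:
        for x, m in inner.items():
            out[x] = out.get(x, 0) + weight * m
    return out

def concentrate(pp):
    """Apply the expectation map inside: each inner distribution becomes its mean."""
    out = {}
    for weight, inner in pp:
        e = mean(inner)
        out[e] = out.get(e, 0) + weight
    return out

# Two equally weighted groups, each with its own spread of values.
pp = [
    (F(1, 2), {0: F(1, 2), 2: F(1, 2)}),
    (F(1, 2), {4: F(1, 2), 6: F(1, 2)}),
]

spread = flatten(pp)       # uniform on 0, 2, 4, 6
peaked = concentrate(pp)   # uniform on 1, 5
```

Both ends of the edge have the same expectation, but `peaked` is obtained from `spread` by concentrating mass within each group, so it is the less risky of the two.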
The question about third-order stochastic dominance is very interesting, I've always wondered how to obtain it in this framework, but I never figured it out. Do you have an idea in mind?
Arthur Parzygnat said:
Dario Stein said:
Fantastic, very interested by that. Sam's and my Beta-Bernoulli monad encodes essentially the conjugate prior relationship of the two distributions and nothing more; I'd be keen to work out if we can formalize conditioning on the Bernoulli flips in a synthetic way (the main difficulty so far was the central position of if/coproducts in the language, which Markov categories still don't reflect well).
I'd like to understand what you mean in a little more detail, if you could elaborate or point me to a reference. For example, what exactly is the Beta-Bernoulli monad? Perhaps this paper of yours? And is what you're talking about related to Jacobs' work on conjugate priors [Ref: https://arxiv.org/abs/1707.00269]? From my understanding of his paper, he had a nice description of conjugate priors in the Markov category language.
@Arthur Parzygnat yes, I meant that paper -- we didn't frame things very categorically there, but this amounts to giving a commutative&affine monad on the functor category [Fin,Set] (Sam mentioned this approach in his talk). Its Kleisli category is thus a minimalistic combinatorial Markov category that encodes beta+bernoulli+their conjugate relationship. Thanks for the reference to Bart's paper, I'll have a look at it.
@Paolo Perrone Thanks! Regarding third-order: no, I don't even know what third order dominance would mean. I'd hope it would pop out of the categorial view you have ..
Joscha Diehl said:
Paolo Perrone Thanks! Regarding third-order: no, I don't even know what third order dominance would mean. I'd hope it would pop out of the categorial view you have ..
My intuition would suggest, "the measure is more concentrated on lower values and more spread on higher values". But that's as much as I can say for now, nothing precise. (Imagine, on R, assigning a larger integral to functions such as f(x)=x^3.)
Paolo Perrone said:
Joscha Diehl said:
Paolo Perrone Thanks! Regarding third-order: no, I don't even know what third order dominance would mean. I'd hope it would pop out of the categorial view you have ..
My intuition would suggest, "the measure is more concentrated on lower values and more spread on higher values". But that's as much as I can say for now, nothing precise. (Imagine, on R, assigning a larger integral to functions such as f(x)=x^3.)
Is there something like this already in the literature? Also, you might have mentioned it, but is it obvious how first order and second order dominance are part of a 'ladder' of dominances?
Joscha Diehl said:
Paolo Perrone said:
Joscha Diehl said:
Paolo Perrone Thanks! Regarding third-order: no, I don't even know what third order dominance would mean. I'd hope it would pop out of the categorial view you have ..
My intuition would suggest, "the measure is more concentrated on lower values and more spread on higher values". But that's as much as I can say for now, nothing precise. (Imagine, on R, assigning a larger integral to functions such as f(x)=x^3.)
Is there something like this already in the literature? Also, you might have mentioned it, but is it obvious how first order and second order dominance are part of a 'ladder' of dominances?
@Joscha Diehl Yep, but I don't understand them very well: https://en.wikipedia.org/wiki/Stochastic_dominance#Third-order