Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: categorical probability

Topic: traced Markov categories


view this post on Zulip Jules Hedges (May 09 2022 at 15:01):

Has anyone ever looked into Markov categories that are also traced monoidal, describing steady states of Markov chains?

view this post on Zulip Jules Hedges (May 09 2022 at 15:01):

This seems to me like a Thing That Ought To Be Done

view this post on Zulip Robin Piedeleu (May 09 2022 at 15:22):

@Mateo Torres-Ruiz and I were just wondering about this! A first (unsatisfactory) answer is that you can embed the category of stochastic maps into that of linear maps with the tensor as monoidal product, which is traced, at least in the finite-dimensional case---I'll let people that are knowledgeable in functional analysis tell us how to generalise this to infinite dimensions. But I suppose the (partial) trace of a stochastic map might not be stochastic, which is why I said that was unsatisfactory.
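
Robin's point that the (multiplicative) partial trace of a stochastic map need not be stochastic is easy to check numerically. A small NumPy sketch, with made-up example data:

```python
import numpy as np

# A stochastic map f : A (x) X -> A (x) Y is a column-stochastic matrix
# indexed by pairs.  Here |A| = |X| = |Y| = 2, so f is 4x4 with columns
# summing to 1.  (Hypothetical example data.)
rng = np.random.default_rng(0)
f = rng.random((4, 4))
f /= f.sum(axis=0)            # normalise columns: f is stochastic

# The tensor-product ("multiplicative") partial trace over A:
#   Tr_A(f)[y, x] = sum_a f[(a, y), (a, x)]
g = f.reshape(2, 2, 2, 2)     # indices: (a_out, y, a_in, x)
tr = np.einsum('ayax->yx', g)

# The columns of the partial trace generally do not sum to 1,
# so the result is not a stochastic matrix.
print(tr.sum(axis=0))
```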

view this post on Zulip Robin Piedeleu (May 09 2022 at 15:24):

And this might not be the sort of trace you want if steady states of Markov chains is what you're studying. You'd prefer some additive (particle-style in GoI terms?) trace, computed as the sum of iterated powers of your stochastic matrix.
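
To illustrate the iterative picture (just the steady-state computation that such an additive trace would be meant to capture, not a trace operator itself):

```python
import numpy as np

# For an ergodic column-stochastic matrix P, the iterated powers P^n
# converge to a rank-one projection onto the steady state pi, with
# P pi = pi.  (Illustrative example data.)
P = np.array([[0.9, 0.2],
              [0.1, 0.8]])                   # columns sum to 1

pi = np.linalg.matrix_power(P, 200)[:, 0]    # any column of P^n -> pi
print(pi)                                    # steady state, here [2/3, 1/3]
print(P @ pi)                                # P pi = pi (up to rounding)
```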

view this post on Zulip Tobias Fritz (May 10 2022 at 07:45):

Hmm, I'm pretty skeptical of traced Markov categories, since wouldn't a traced SMC whose monoidal unit is terminal be highly degenerate in some way? Like being the terminal category (up to equivalence)? I can't prove it right now, but this is probably what happens based on the intuition that the trace disconnects (towards the codomain) if the monoidal unit is terminal. (corrected below)

So maybe you rather want an abstract version of the category of substochastic matrices, similar to what Robin suggests. This looks a lot like you may want a traced gs-monoidal (2-)category, though I'm not very familiar with these.

In either case I'm not aware of anyone having looked into it. With @Paolo Perrone and Sean Moss we have some preliminary results towards a theory of stochastic dynamical systems, but these go more in the direction of ergodic decomposition and ergodic theorems. But of course there may be things that I'm not aware of.

view this post on Zulip Robin Piedeleu (May 10 2022 at 08:29):

Tobias Fritz said:

Hmm, I'm pretty skeptical of traced Markov categories, since wouldn't a traced SMC whose monoidal unit is terminal be highly degenerate in some way?

Hmm... I don't think so. There are plenty of traced cartesian categories around (e.g. relations with the disjoint sum as monoidal product, which is a biproduct in this case) and their monoidal unit is terminal.

view this post on Zulip Tobias Fritz (May 10 2022 at 08:30):

Ah, right, thanks! I knew that at some point, but apparently lost trace of it.

view this post on Zulip Robin Piedeleu (May 10 2022 at 08:30):

I'm pretty sure you can't have a category that's both compact closed and Markov without significant degeneracy (probably a preorder).

view this post on Zulip Robin Piedeleu (May 10 2022 at 08:31):

But traced should be ok.

view this post on Zulip Tobias Fritz (May 10 2022 at 08:34):

Robin Piedeleu said:

I'm pretty sure you can't have a category that's both compact closed and Markov without significant degeneracy (probably a preorder).

Just equivalent to the terminal category: if $\mathbf{C}$ is compact closed and semicartesian, then every hom-set $\mathbf{C}(X,Y) \cong \mathbf{C}(X \otimes Y^*, I)$ is a singleton.
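
For readers following along, the one-liner unpacks to a short isomorphism chase, using only standard facts about compact closed categories:

```latex
% Transposing along the duality Y -| Y* gives the first isomorphism;
% if the monoidal unit I is terminal (semicartesianness), the last
% hom-set has exactly one element:
\mathbf{C}(X, Y)
  \;\cong\; \mathbf{C}(X \otimes Y^*, I)
  \;\cong\; \{*\},
\qquad \text{so } \mathbf{C} \simeq \mathbf{1}.
```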

view this post on Zulip Jules Hedges (May 10 2022 at 09:08):

Yeah, both traced cartesian and traced cocartesian not only exist but they're historically really important in program semantics

view this post on Zulip Jules Hedges (May 10 2022 at 09:13):

I think for example [sets, partial functions] can be equipped with traces for both its cartesian and cocartesian products

view this post on Zulip Tobias Fritz (May 10 2022 at 09:18):

Well then, I agree that e.g. studying traces on FinStoch looks like a Thing That Ought To Be Done :sweat_smile: Assuming that it hasn't been done already.

Do you think that Hasegawa's equivalence between traces and fixed point operators still holds for Markov categories? In other words, does his proof require the full universal property of products, or are weaker properties of copying and discarding enough?

view this post on Zulip Jules Hedges (May 10 2022 at 09:20):

Tobias Fritz said:

Just equivalent to the terminal category: if $\mathbf{C}$ is compact closed and semicartesian, then every hom-set $\mathbf{C}(X,Y) \cong \mathbf{C}(X \otimes Y^*, I)$ is a singleton.

Incidentally this also implies that you can't have both compact closed + cartesian, and it's way simpler than the proof sketch I knew for that: namely, note that the unit $\eta : I \to X \times X^*$ is $\times$-separable (as all cartesian states are), then use the yanking equation to prove that identities are $\circ$-separable, and then use that to contradict the identity axioms of a category

view this post on Zulip Jules Hedges (May 10 2022 at 09:22):

Tobias Fritz said:

Do you think that Hasegawa's equivalence between traces and fixed point operators still holds for Markov categories? In other words, does his proof require the full universal property of products, or are weaker properties of copying and discarding enough?

Good question... without looking I'd guess it most likely pushes a morphism through copying somewhere, but it's probably worth checking

view this post on Zulip Tobias Fritz (May 10 2022 at 09:32):

BTW this sounds like something that Prakash Panangaden must have thought about. I see something mentioned in Section 5.4 of Labelled Markov Processes, but don't have time now to look into it further.

view this post on Zulip dusko (May 10 2022 at 13:59):

if it is true that markov categories are (symmetric?) monoidal with a commutative comonoid on each object, then such categories were studied in the 80s, in bialgebra cohomology, with partial traces in the representations. eg gerstenhaber had papers about that. even bob pare a couple, whom category theorists should know. see the references of tom fox's paper in JPAA where he shows that the category of coalgebras is cofree cartesian over monoidal. it's a fairly well-known paper. in the CS, categories with a comonoid on every object arose maybe in the early 90s in concurrency, from milner's action calculi. i think john power wrote some papers with milner about that...

what i don't understand is what andrej markov has to do with this structure :)

i am not asking in a negative way. i really don't understand. the fact that some markov chain phenomena involve copying and deleting does not mean that they are an essential feature. if they were, then we should be able to construct markov chains whenever the markov category structure is available. isn't any cartesian category a markov category? but the fact that the morphisms preserve copying makes everything deterministic. shouldn't we be able to present at least the unbiased coin in every markov category?

view this post on Zulip Jules Hedges (May 10 2022 at 14:04):

Markov categories are supposed to axiomatise categories whose morphisms are some kind of Markov kernel... deterministic functions are "Markov kernels for 2-valued probability"

view this post on Zulip dusko (May 10 2022 at 14:06):

2-valued probability is a bit of a ghost, isn't it? a 2-valued probability measure on a powerset is an ultrafilter. andrej markov did study pushkin's poetry, but it takes a lot of poetry to see a markov chain in an ultrafilter.

view this post on Zulip Jules Hedges (May 10 2022 at 14:33):

I think the ultimate answer is that quite a lot of probability theory can be done in a generic Markov category, although obviously not all of it. It seems like a sweet spot between how complicated the axioms are and how much you can get out of them

view this post on Zulip Tobias Fritz (May 10 2022 at 14:57):

dusko said:

if it is true that markov categories are (symmetric?) monoidal with a commutative comonoid on each object, then such categories were studied in the 80s, in bialgebra cohomology, with partial traces in the representations. eg gerstenhaber had papers about that. even bob pare a couple,

Interesting! The earliest references I know of so far are from the 90s from the term graph community, such as Gadducci's thesis. Do you have some concrete pointers to papers from the 80s? Looking at Bialgebra cohomology, deformations, and quantum groups by Gerstenhaber and Schack, I don't see mention of any general class of categories in there, let alone SMCs with a commutative comonoid on each object. I also can't find anything by Bob Paré on bialgebra cohomology.

i am not asking in a negative way. i really don't understand. the fact that some markov chain phenomena involve copying and deleting does not mean that they are an essential feature. if they were, then we should be able to construct markov chains whenever the markov category structure is available. isn't any cartesian category a markov category? but the fact that the morphisms preserve copying makes everything deterministic.

To add a bit to what Jules already said, I think that it is indeed quite surprising that copying and deleting are an essential feature of probability, and this is arguably why the topic has appeared only in the past few years. Here's a rough sketch of how to arrive at this structure from elementary probability.

First, note that product distributions, or equivalently stochastic independence, play a crucial role in probability. Therefore it's sensible to axiomatize Markov kernels as an SMC. But for many statements of probability, this is not enough! Consider for example the equation $p(x,y) = p(x) \cdot p(y|x)$, which one may take as the definition of the conditional probability $p(y|x)$. This equation has two occurrences of $x$ on the right-hand side, implying that it cannot be translated into pure SMC theory. This is where the comonoid structures come to the rescue.

On the other hand, one clearly cannot expect to have a cartesian monoidal category, since a joint distribution of two random variables is not determined by their individual marginal distributions. In a sense, this is what probability theory is all about. Therefore one cannot expect the comultiplications to be natural. However, product projections are still natural transformations.

So we're arriving at the concept of SMCs with commutative comonoid structures, compatible with the monoidal structure, and such that projections are still natural. And that's (one formulation of) the definition of Markov category.
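
For reference, the resulting definition can be recorded as follows (this matches the standard formulation in the literature):

```latex
% A Markov category is a symmetric monoidal category (C, \otimes, I)
% with I terminal, in which every object X carries a commutative comonoid
%   copy_X : X \to X \otimes X,  del_X : X \to I,
% compatible with the monoidal structure.  Naturality of del holds
% automatically (I is terminal), but naturality of copy is NOT assumed:
%   copy_Y \circ f \;\neq\; (f \otimes f) \circ copy_X \quad \text{in general},
% and a morphism f is called deterministic exactly when equality holds.
```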

shouldn't we be able to present at least the unbiased coin in every markov category?

I would say no more or less than we "should" be able to present e.g. Cartesian coordinates in every cartesian category.

view this post on Zulip dusko (May 10 2022 at 17:57):

thank you for the nice explanation. i am quite impressed and curious about the way you and i guess lots of other people have been developing probability in categories. but since it seems to be a real movement, maybe it could escape the usual workflow of Working Mathematics for Category Theorists, and make itself really useful to real practitioners? categorical tools are genuinely useful, but the friendly reception within our own little community often prevents us from taking up the real challenges.

so let me try to be critical for a minute, not because i am critical, but just in case that it might be useful.

someone said that the title of the project is "synthetic probability theory". if you look at synthetic differential geometry, or even hilbert's synthetic geometry, the idea is to supply a set of axioms sufficient to completely derive the essential parts of the theory, without borrowing ready-made parts from it. recognizing some necessary properties is nice, but if these properties are shared with confusingly different theories, we may confuse ourselves. in any case, finding tighter axioms prevents such confusions. (eg kolmogorov spent a great part of his life trying to provide a synthetic reconstruction of probability theory, better than the one from his youth in that it would also engender a theory of information and statistics. the resulting theory of randomness is running strong, but all agree that it is in need of a language...)

Tobias Fritz said:

dusko said:

if it is true that markov categories are (symmetric?) monoidal with a commutative comonoid on each object, then such categories were studied in the 80s, in bialgebra cohomology, with partial traces in the representations. eg gerstenhaber had papers about that. even bob pare a couple,

Interesting! The earliest references I know of so far are from the 90s from the term graph community, such as Gadducci's thesis. Do you have some concrete pointers to papers from the 80s? Looking at Bialgebra cohomology, deformations, and quantum groups by Gerstenhaber and Schack, I don't see mention of any general class of categories in there, let alone SMCs with a commutative comonoid on each object. I also can't find anything by Bob Paré on bialgebra cohomology.

comonoids and the explanations of their meaning are there in the papers that you mention, and bob pare's paper is surely cited in tom fox's. but people were not so interested in categories as such, but in problems that get solved using them. if you want explicit analysis of comonoids as a categorical structure, then that would bring us from papers solving math problems to papers analyzing categorical structures. i am afraid that i cannot avoid mentioning
https://www.cambridge.org/core/journals/mathematical-structures-in-computer-science/article/abs/categorical-logic-of-names-and-abstraction-in-action-calculi/7C1B12F9B6EE6994E4EDB4CC88DE0E2C
maybe there are more references there. the whole thing goes back to lambek's observation that a comonoid, ie the comonad induced by it, ie the kleisli category of that comonad, is the categorical version of freely adjoining a variable. an algebraic construct. plotkin also used such things in some of his lics papers, i think.

First, note that product distributions, or equivalently stochastic independence, play a crucial role in probability. Therefore it's sensible to axiomatize Markov kernels as an SMC. But for many statements of probability, this is not enough! Consider for example the equation $p(x,y) = p(x) \cdot p(y|x)$, which one may take as the definition of the conditional probability $p(y|x)$. This equation has two occurrences of $x$ on the right-hand side, implying that it cannot be translated into pure SMC theory. This is where the comonoid structures come to the rescue.

ah but the algebraic manipulations of the occurrences of $x$ and $y$ are what made people call them the random variables. a variable is a thing that can be copied and deleted. the fact that it needs to be copied when we reason about conditional probability is not a property of conditional probability, but a feature of the algebra of random variables. edward nelson reconstructed the entire "radically elementary probability theory" from that. but lambek also switched between variables and comonoids/comonads in all kinds of contexts, from rings and modules to deductive systems and cartesian categories. (see either of his books on these subjects.)

bob coecke and i used the comonoid structures to recognize classical states in quantum universes because the comonoid structure allows copying and deletion, which the quantum states do not permit:
https://arxiv.org/abs/0904.1997
uh sorry i don't mean to spam the world by advertising my work but it occurs to me now that in that paper we distinguished not only classical/quantum processes using the comonoids, but also stochastic processes, precisely with your totality requirement. this didn't occur to me until now. but that was just peeling pieces of structure that happen to be there, with no hope to really characterize stochasticity in such terms...

shouldn't we be able to present at least the unbiased coin in every markov category?

I would say no more or less than we "should" be able to present e.g. Cartesian coordinates in every cartesian category.

well, the cartesian coordinates got descartes' name because he started denoting abstract points by tuples of variables. the cartesian categories got descartes' name from cartesian products, whose elements are the tuples that can be referred to by variables. the elements of a tensor product cannot be denoted by such coordinates. the way cartesian structure allows manipulating abstract cartesian coordinates was worked out in early semantics of computation, cf lambek-scott's book...

sorry about the flood of words. it is because i really like what you guys have been doing. it would be even better if you would do more :)

view this post on Zulip dusko (May 11 2022 at 02:19):

QUESTION: take an arbitrary commutative monad $T : Set \to Set$. Let ${\cal Ma}_T = Kl(T)/1$ be the slice over $1$ of the Kleisli category $Kl(T)$. the cartesian structure of $Set$ induces a monoidal structure on ${\cal Ma}_T$, with a comonoid on every object $X$ whose counit is the terminal arrow $X \to 1$. This is a Markov category, isn't it? If that structure supports probability theory, we have a central limit theorem, a family of poisson processes, a law of small numbers. Can we really reason probabilistically under any commutative monad on $Set$?
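
For orientation, the prototypical affine instance of this construction is the finite distribution monad on $Set$, whose Kleisli category is FinStoch: a Kleisli arrow $X \to Y$ is a function from $X$ to distributions on $Y$, and Kleisli composition is exactly the Chapman-Kolmogorov formula. A minimal Python sketch (the helper names are made up):

```python
# Kleisli composition for the finite distribution monad, where a
# distribution on Y is a dict {y: probability}.  This is exactly
# composition of Markov kernels / Chapman-Kolmogorov.
def kleisli_compose(f, g):
    """(g after f)(x)[z] = sum_y f(x)[y] * g(y)[z]"""
    def h(x):
        out = {}
        for y, p in f(x).items():
            for z, q in g(y).items():
                out[z] = out.get(z, 0.0) + p * q
        return out
    return h

# A biased coin as a kernel 1 -> 2, and a noisy negation 2 -> 2:
coin = lambda _: {'H': 0.7, 'T': 0.3}
flip = lambda b: {b: 0.1, ('T' if b == 'H' else 'H'): 0.9}

# Composite kernel 1 -> 2: {'H': 0.34, 'T': 0.66}
print(kleisli_compose(coin, flip)(None))
```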

view this post on Zulip JS PL (he/him) (May 11 2022 at 05:37):

Jules Hedges said:

I think for example [sets, partial functions] can be equipped with traces for both its cartesian and cocartesian products

The category of sets and partial functions, which I'll call $\mathsf{PAR}$, with tensor = coproduct is indeed a traced (cocartesian) monoidal category. The trace is described on page 3 of https://www.mathstat.dal.ca/mfps2018/preproc/paper_8.pdf (in that paper they also talk about partially defined trace operators)

view this post on Zulip JS PL (he/him) (May 11 2022 at 05:51):

On the other hand, I don't know of a trace for $\mathsf{PAR}$ with respect to the Cartesian product (which, it is worth mentioning, is not a categorical product but rather a restriction product in the sense of restriction categories https://arxiv.org/abs/math/0610500).

view this post on Zulip JS PL (he/him) (May 11 2022 at 05:55):

Robin Piedeleu said:

I'm pretty sure you can't have a category that's both compact closed and Markov without significant degeneracy (probably a preorder).

Out of curiosity: what about Markov and *-autonomous?

view this post on Zulip JS PL (he/him) (May 11 2022 at 05:56):

Tobias Fritz said:

Do you think that Hasegawa's equivalence between traces and fixed point operators still holds for Markov categories? In other words, does his proof require the full universal property of products, or are weaker properties of copying and discarding enough?

Well I'm currently Hasegawa's postdoc and we're working on traced monoidal categories stuff. I could ask him what he thinks!

view this post on Zulip Tobias Fritz (May 11 2022 at 07:09):

Thanks for the kind words @dusko ! It's always interesting to hear your thoughts, given your unusual depth of insight and breadth of knowledge.
dusko said:

i am quite impressed and curious about the way you and i guess lots of other people have been developing probability in categories. but since it seems to be a real movement, maybe it could escape the usual workflow of Working Mathematics for Category Theorists, and make itself really useful to real practitioners? categorical tools are genuinely useful, but the friendly reception within our own little community often prevents us from taking up the real challenges.

Right, completely agreed! I guess everyone has their own attitude and plans in this respect, so I can only speak for myself: I'm personally interested mainly in fundamental things, or in other words I'm more of a philosopher than a practitioner. So I won't take up the real challenges myself. But fortunately there are many others here who work on real challenges with ACT. With Markov categories specifically, for example the people at Planting.Space seem to find the theory valuable and use it at the theoretical end of things.

Edit: if traditional probability theorists also count as "practitioners", then I'm even personally interested and involved in making the formalism useful to them. For example, our paper on de Finetti has appeared in a stochastic analysis journal.

someone said that the title of the project is "synthetic probability theory". if you look at synthetic differential geometry, or even hilbert's synthetic geometry, the idea is to supply a set of axioms sufficient to completely derive the essential parts of the theory, without borrowing ready-made parts from it. recognizing some necessary properties is nice, but if these properties are shared with confusingly different theories, we may confuse ourselves. in any case, finding tighter axioms prevents such confusions.

Indeed. For now we may still be quite far from a complete set of axioms, but that is one of the big goals of the theory. I like to compare it with abstract algebra as a synthetic theory of number systems: we can characterize $\mathbb{R}$ as the unique complete Archimedean ordered field, but there's also a lot of other stuff in field theory and commutative algebra that's interesting, and plenty of theorems that apply at various levels of generality. So for me, synthetic probability is more of a (partially ordered) hierarchy of theories; let me say a bit more on that below. On the other hand, there's also Alex Simpson's synthetic probability theory, which is not directly related to Markov categories and intended to be complete (at least morally).

On the references discussion:

comonoids and the explanations of their meaning are there in the papers that you mention, and bob pare's paper is surely cited in tom fox's.

Sorry about being dense, but I'm still not sure what you're referring to. Is it Thomas Fox, Coalgebras and cartesian categories? That is from 1976 (and not JPAA), so it can't cite bialgebra cohomology papers from the 80s, and there's no citation to Paré either. So maybe you have a different Tom Fox paper in mind? Do you have a link?

if you want explicit analysis of comonoids as a categorical structure, then that would bring us from papers solving math problems to papers analyzing categorical structures. i am afraid that i cannot avoid mentioning
https://www.cambridge.org/core/journals/mathematical-structures-in-computer-science/article/abs/categorical-logic-of-names-and-abstraction-in-action-calculi/7C1B12F9B6EE6994E4EDB4CC88DE0E2C

Interesting! The definition that you've given there,
image.png
looks somewhat related to the definition of Markov category. Now I don't know anything about action calculi, which makes the paper a bit difficult for me to follow, and I haven't really grasped the admissibility condition yet. So let me just ask here: your definition doesn't seem to require that the comonoid structure on a tensor product must be the tensor product comonoid structure, although this is a natural and important condition. Or can this be obtained from your admissibility somehow? If so, for what choice of $\mathcal{K}$?

ah but the algebraic manipulations of the occurrences of $x$ and $y$ are what made people call them the random variables. a variable is a thing that can be copied and deleted. the fact that it needs to be copied when we reason about conditional probability is not a property of conditional probability,

Yes, it is a property of conditional probability or conditioning, in the sense that copying of variables is necessary to express it. Not so for other concepts in probability, such as independence, where we can simply write $p(x,y) = p(x)\, p(y)$ and no copying of variables occurs.

but a feature of algebra of random variables.

Sure, but I don't see that as being substantially different, since the multiplication in an algebra is just the Heisenberg picture or predicate transformer version of the duplication of variables. It's the comonoid structures that are important at the synthetic level, and this can manifest itself semantically either as an actual copying or diagonal map, or as the formal dual of a multiplication or monoid structure.

bob coecke and i used the comonoid structures to recognize classical states in quantum universes because the comonoid structure allows copying and deletion, which the quantum states do not permit:
https://arxiv.org/abs/0904.1997

Right. However, the Frobenius structures used in that paper and others by Bob are actually not suitable for measure-theoretic probability, since there it's really only the comonoid structures that exist, but no multiplications and no units. And surely any formalism laying claim to words like "synthetic probability theory" must have a way of accounting for measure theory (even if that is by arguing it away).

I've been simplifying a little bit, since one can at least have multiplications also in measure-theoretic probability (resulting in non-unital Frobenius structures) if one is willing to drop normalization of probability, but for synthetic probability this seems to be the less useful approach (and I could elaborate if anyone wants more details).

view this post on Zulip Tobias Fritz (May 11 2022 at 07:36):

dusko said:

QUESTION: take an arbitrary commutative monad $T : Set \to Set$. Let ${\cal Ma}_T = Kl(T)/1$ be the slice over $1$ of the Kleisli category $Kl(T)$. the cartesian structure of $Set$ induces a monoidal structure on ${\cal Ma}_T$, with a comonoid on every object $X$ whose counit is the terminal arrow $X \to 1$. This is a Markov category, isn't it?

Not quite, since the comultiplications will generally not be morphisms in the slice category. But if you restrict yourself to those objects for which this is the case, then yes, you get a Markov category. The most relevant case of this construction is when $T(1) \cong 1$ ($T$ is affine), since then it's just the Kleisli category itself that is a Markov category, and this works whenever $T$ lives on a cartesian category. We call Markov categories that arise like this (in a nice way) representable.

If that structure supports probability theory, we have a central limit theorem, a family of poisson processes, a law of small numbers. Can we really reason probabilistically under any commutative monad on $Set$?

No, most constructions and theorems of probability theory will require additional hypotheses to be applicable. For example, not every Markov category has conditionals, and the existence of conditionals is a crucial hypothesis in our proof of the de Finetti theorem. In fact, the only nontrivial Markov categories that we know of and satisfy all the relevant hypotheses are those of measure-theoretic probability. For zero-one laws, one can get by with weaker hypotheses which make those results also apply to suitable categories of relations. So here we have a hierarchy of relevant assumptions, and I think of it as vaguely analogous to assumptions one makes on rings and fields in algebra, all the way from general rings to (for example) complete Archimedean ordered fields.

The central limit theorem and things like that are big words, and getting a good categorical treatment of more quantitative things like that is very challenging. We have some preliminary results on the law of large numbers and some basics of ergodic theory with Markov categories, including a version of ergodic decomposition. But these are again qualitative things that are easier to do.


What I gather from your comments overall is a certain skepticism towards using the term "synthetic probability theory" for what we're doing. That's a valid criticism, and we should perhaps be careful with using that term at the present stage where we can only reproduce a small number of qualitative results from probability and statistics, and we don't yet have a detailed idea of how to approach more quantitative theorems.

view this post on Zulip JS PL (he/him) (May 11 2022 at 09:33):

Jules Hedges said:

Tobias Fritz said:

Do you think that Hasegawa's equivalence between traces and fixed point operators still holds for Markov categories? In other words, does his proof require the full universal property of products, or are weaker properties of copying and discarding enough?

Good question... without looking I'd guess it most likely pushes a morphism through copying somewhere, but it's probably worth checking

So I asked Hassei (Hasegawa), and he says that yes naturality of the copy map is essential for the equivalence. That said, he says that in some cases, you can still obtain a fixed point operator from a trace even if the copy map is unnatural.

view this post on Zulip JS PL (he/him) (May 11 2022 at 09:34):

So for traced Markov categories, it does not seem like the trace would be equivalent to a sort of fixed point operator.

view this post on Zulip Jules Hedges (May 11 2022 at 09:47):

I can only speak for myself as a "consumer" of categorical probability theory - I've found that Markov categories are a really nice setting for open games. You need a supply of comonoids for intuitive game-theoretic reasons, for example because some move needs to be known by both subsequent players and the payoff function. And the $I = 1$ axiom turns out to be useful for technical reasons: it makes some boring complications that appear in the general monoidal case go away

view this post on Zulip Jules Hedges (May 11 2022 at 09:48):

Whether this counts as "practical" depends on what you think about the open games project, but we're trying (although the practical end is all code, and we've never attempted to write polymorphic code in Markov-category terms; they stay in the back-of-shop)

view this post on Zulip Jules Hedges (May 11 2022 at 10:39):

JS Pacaud Lemay (he/him) said:

So for traced Markov categories, it does not seem like the trace would be equivalent to a sort of fixed point operator.

I also twote the question about traced Markov categories, and Davidad replied pointing out that Markov chains don't have unique steady states, so you'd have to pick a canonical one, like "trace = maximum entropy steady state" (for example, so the trace of the identity map $2 \to 2$ would give a uniform distribution). So I'd vaguely conjecture that this gives an example of a traced Markov category

view this post on Zulip Tobias Fritz (May 11 2022 at 11:21):

So how exactly would you define the trace of a general stochastic map $A \otimes X \to A \otimes Y$ between finite sets? I think that it'll be difficult to get the naturality in $Y$ to work.

view this post on Zulip Jules Hedges (May 11 2022 at 11:24):

I guess the general idea is $\mathrm{Tr}_A f(x) = f(a,x)_2$, where $a$ is a steady state of $\lambda a . f(a,x)_1 : A \to A$. If it fails to be natural then I guess the idea just doesn't work
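
Jules's formula can be prototyped directly. The following NumPy sketch is a hypothetical construction (the function name is made up), and whether it satisfies the trace axioms, naturality in $Y$ in particular, is exactly what is in doubt here:

```python
import numpy as np

# Sketch of the proposed trace: for a stochastic map f : A (x) X -> A (x) Y
# with entries f[a_out, y, a_in, x], fix x, take the marginal kernel on A,
# pick a steady state pi by power iteration, and push pi through the
# Y-marginal.  Assumes each A-marginal kernel is ergodic, so that power
# iteration converges to a unique steady state.
def steady_state_trace(f, iters=500):
    nA, nY, _, nX = f.shape
    tr = np.zeros((nY, nX))
    for x in range(nX):
        kernelA = f[:, :, :, x].sum(axis=1)                # a |-> marginal on A
        pi = np.linalg.matrix_power(kernelA, iters)[:, 0]  # steady state of A-dynamics
        tr[:, x] = np.einsum('bya,a->y', f[:, :, :, x], pi)  # push pi to Y
    return tr
```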

view this post on Zulip Sam Staton (May 11 2022 at 18:48):

Hi, I think traces/recursion are interesting and we’ve thought about it a bit. For a simple example we might consider the trace of the map $I\otimes 2^\omega\to 2^\omega\otimes 2^\omega$
given by
$$I\otimes 2^\omega \xrightarrow{\,\mathrm{bernoulli}\,\otimes\,\mathrm{id}\,} 2 \otimes 2^\omega \xrightarrow{\,\mathrm{cons}\,} 2^\omega \xrightarrow{\,\mathrm{copy}\,} 2^\omega \otimes 2^\omega$$
where $2^\omega$ is an object of infinite sequences of booleans, and (cons) adds something to the front of an infinite sequence.
The trace gives us a map $I \to 2^\omega$, i.e. a random infinite sequence.
The "naturality" axiom of traced monoidal categories says that you can pull the Bernoulli to the outside. So this random sequence would be the random choice between an infinite stream of heads and the infinite stream of tails, and it shouldn't be an infinite stream of coin flips.

We made a Haskell library, lazyppl, to experiment with categorical probability. And we explored various kinds of recursion there… I can define a trace operator and define this random sequence using it. (We do also have infinite streams of coin flips in lazyppl.) The underlying category I have in mind is pointed wcpos or wQBSs, so when I wrote $2^\omega$ above I really meant $2^\omega_\bot$. A connection between traces and "mfix" is in the traced premonoidal categories paper.
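The behaviour Sam describes can be mimicked in plain, dependency-free Haskell (a sketch under the assumption that the trace ties a recursive knot after a single Bernoulli outcome has been fixed; `tracedStream` is an illustrative name, not lazyppl's API):

```haskell
-- The Bernoulli draw happens once, "outside" the loop, as naturality
-- demands; the recursion then copies that single outcome forever.
tracedStream :: Bool -> [Bool]
tracedStream coin = s
  where s = coin : s   -- the fixed point of (coin :)
```

Any finite prefix, e.g. `take 5 (tracedStream True)`, is constant: all heads or all tails, never an iid sequence of flips.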

view this post on Zulip Chris Barrett (May 12 2022 at 08:43):

Tobias Fritz said:

I've been simplifying a little bit, since one can at least have multiplications also in measure-theoretic probability (resulting in non-unital Frobenius structures) if one is willing to drop normalization of probability, but for synthetic probability this seems to be the less useful approach (and I could elaborate if anyone wants more details).

I'm curious to hear more about this. Is it possible to drop normalization, but keep a requirement of sub-normalization? It always struck me that Frobenius multiplication gives a "conditioning" operation, and so "ought to" fit naturally into a categorical account of probability theory, somehow.

In some cases, the ability to form cycles is responsible for the loss of normalization, and I've wondered if moving to *-autonomous categories might be an improvement in this case (restricting how one can compose diagrams in order to maintain (sub-)normalization), and indeed I believe one can still form a reasonable generalized definition of "Frobenius algebra on every object" in this setting. I'd be curious to hear any thoughts you might have around any of this!

view this post on Zulip Tobias Fritz (May 12 2022 at 09:23):

Chris Barrett said:

I'm curious to hear more about this. Is it possible to drop normalization, but keep a requirement of sub-normalization? It always struck me that Frobenius multiplication gives a "conditioning" operation, and so "ought to" fit naturally into a categorical account of probability theory, somehow.

That is indeed intuitive, but the main trouble with the idea is in measure-theoretic probability, where individual outcomes all typically have probability zero. This runs into a couple of related issues, one being that for continuous variables, the subnormalized version of conditioning typically gives identically $0$, while the usual "normalized" conditioning still makes sense and is useful (see regular conditional probability). Another is that Frobenius monoids are typically finite(-dimensional) or compact in some way, and it's difficult to imagine a definition of morphism that would achieve that for the kinds of structures that appear in measure theory (or even potential future replacements of measure theory). A third point is that having normalization is really important for distributions on infinite products, or equivalently situations with infinitely many random variables. If you have infinitely many independent coin flips, and each of them is merely subnormalized, then the total normalization would again be zero.

Perhaps one could get around all of this e.g. by allowing infinitesimal probabilities, so my point is just to indicate that it's difficult to deal with infinite things like this in a subnormalization setting. And presumably we still want to be able to model situations with infinitely many continuous variables, at least as convenient idealizations.

Perhaps more importantly, it also turns out that there is no need for anything more than a comonoid structure, since the latter is already sufficiently expressive for a lot of things (like conditionals), and I can't think of any general construction or theorem in probability where one would require something like a Frobenius structure in order to state it. So even if one could use Frobenius structures, one can argue that it would be overkill.

As for *-autonomous categories, I don't see why one would expect those to be relevant. What would be a candidate object that we would expect to not be isomorphic to its own dual? (If I understand correctly, if all objects are self-dual, then we're back in the compact closed case, right?) For example, we would want to have an object that corresponds to the space of outcomes of countably many coin flips, something like $\{0,1\}^{\mathbb{N}}$. Would you expect this object to be self-dual or not? If not, how should I think about its dual?

view this post on Zulip Jules Hedges (May 12 2022 at 10:35):

Going into good solid scientific wild speculation, my first thought was that if $X$ is "the type of random processes/random variables of type $X$" then $X^*$ ought to be "the type of events/measurements of type $X$" (although we can expect to run into the same normalisation issues, probably)

view this post on Zulip Chris Barrett (May 12 2022 at 11:17):

Thank you for the response! It seems that -- as one might expect -- working in infinite dimensions causes issues for normalization and in particular for the Frobenius structure. This sounds like a familiar issue in, e.g., categorical quantum mechanics -- although there they do need the extra structure, whereas you're right to point out that you don't seem to!

I will quickly add that I'm not an expert in this area, especially in the functional analysis/measure theory necessary to work properly with infinite dimensions. I come from a program semantics background, but I've been following developments in ACT keenly. One motivation I have to consider Frobenius multiplication is the following: my upcoming PhD thesis develops a variant of the lambda calculus which (among other remarkable things) constitutes a term language for string diagrams, characterizing (by various first-order fragments) the free Cartesian category, SMC, Markov category, Hypergraph category and the free CDU category. Here, Frobenius multiplication (in the form of equality checking), surprisingly, appears to arise very naturally in the syntax, suggesting to me (if you take the language seriously) that this is a natural structure for computing with (for example, implementing exact conditioning in a PPL). This leads me to want such a structure in my categorical semantics as well. However, I have not developed any of this with infinite-dimensional semantics in mind, and there would no doubt be problems arising in doing so (not least the ones you mention). Sorry if some of this is a bit cryptic: at some point I hope to write all this up properly for discussion here, but I don't want to derail this thread.

Regarding the *-autonomous structure, there is some work on using *-autonomous categories to extend results from CQM to infinite dimensions (see Srinivasan's thesis on "Dagger Linear Logic and Categorical Quantum Mechanics"), where they additionally generalize the definition of Frobenius algebra to a *-autonomous setting. Similar things have been done before, where $A \otimes A \to A$ and $A \to A \oplus A$ are the types of the Frobenius monoid and comonoid, which can then interact via the linear distributive law. This might serve as a direction in which to answer your second point.

Regarding duality, in the field of program semantics, there is an important model of higher-order probabilistic programming called probabilistic coherence spaces. These form a *-autonomous category, and thus have a notion of duality. If we think of $X$ as being some set of states $1 \multimap X$, then $X^\bot$ can be thought of as the set of states $X \multimap 1$ which "(sub-)normalize" every state of $X$. Objects are then subsets such that $X = X^{\bot\bot}$. Consider atomic types here as being the type of sub-normalized states (of possibly countable dimension), closed under double dual. First-order processes $A \multimap B$ are the type of processes (matrices) taking (sub-)normalized states of $A$ to (sub-)normalized states of $B$. And for higher-order processes, an example: $(A \multimap B) \multimap C$ is the type of processes (matrices) taking first-order processes of type $A \multimap B$ to (sub-)normalized states of $C$. Note this is _not_ degenerate: it is not a compact closed category.

Just to reiterate, I don't consider myself an expert here, I'm just trying to give some context and motivation for my naive questions. Obviously, I don't claim any of the above remarks solve the issues with infinite dimensions, nor do I intend to directly contradict anything you said! I hope none of this is derailing the topic too much.

view this post on Zulip Tobias Fritz (May 12 2022 at 12:06):

Jules Hedges said:

Going into good solid scientific wild speculation, my first thought was that if $X$ is "the type of random processes/random variables of type $X$" then $X^*$ ought to be "the type of events/measurements of type $X$" (although we can expect to run into the same normalisation issues, probably)

That was my first thought as well, but then we seem to be working in a different categorical framework, in the sense that the morphisms are no longer interpreted as Markov kernels. And so then I wouldn't know how to think of a morphism in general. Any ideas?

view this post on Zulip Tobias Fritz (May 12 2022 at 12:32):

@Chris Barrett, that makes good sense. I guess it's important to distinguish two separate issues concerning the possibility of having a Frobenius structure (leaving aside the desirability question):

I understand that there may be situations where one really wants to have an equality predicate, and therefore the multiplication. In such a context, it's certainly feasible to move to subnormalization even in measure-theoretic probability, and I think that one will then have non-unital Frobenius structures. (There's no duality since there are no cups, so no problems with infinite-dimensionality.) By what I said above, I think the price to pay is that one probably cannot expect conditioning to make sense, and infinite tensor products or having infinitely many random variables is not going to work either. But I can imagine that these costs are worth the benefit in some applications.

Concerning units on the other hand, maybe you don't actually want those? They seem quite unnatural from the measure theory perspective: there is nothing like a distinguished "uniform distribution" on most infinite measurable spaces, and even in the finite setting I would say that their relevance for a foundational framework of probability is questionable at best.

I've heard of probabilistic coherence spaces, but unfortunately linear logic has always been obscure to me... Very vaguely, is it fair to say that the category of probabilistic coherence spaces is something like (closed) convex cones in (topological) vector spaces, and morphisms certain positive linear maps? If so, then that sounds very much like the category of what quantum physicists would call generalized probabilistic theories. This sort of thing seems to be of interest in the foundations of quantum mechanics, but I think that it's not so useful in the context of classical probability since the comultiplications or copy morphisms no longer exist.

view this post on Zulip Morgan Rogers (he/him) (May 13 2022 at 07:31):

Concerning units on the other hand, maybe you don't actually want those? They seem quite unnatural from the measure theory perspective: there is nothing like a distinguished "uniform distribution" on most infinite measurable spaces, and even in the finite setting I would say that their relevance for a foundational framework of probability is questionable at best.

This claim is wild. Maybe you can't expect to have such a thing on a maximal category of measurable spaces, but spaces which do admit uniform distributions are far more prevalent and useful than wild measure spaces that don't, at least in my limited experience of probability theory, so you should definitely not be dismissing categories of such things.

view this post on Zulip Tobias Fritz (May 13 2022 at 08:19):

I don't think that I've been saying anything particularly surprising or controversial, so let me try to clarify. First of all, I'm not particularly interested in "wild" measurable spaces: it's questionable whether measure theory in its usual form is appropriate for "large" spaces, and there are ongoing attempts at finding a better theory. It's perfectly fine if we restrict ourselves to the standard Borel spaces, which are those that arise from separable and complete metric spaces. There everything works nicely.

Now let's get into what we mean by a "uniform distribution". In my earlier post, the notion that I had in mind would be something like "a distinguished morphism $u_X : I \to X$ for every object $X$, such that these are preserved by isomorphisms and multiplicative under tensor". But already isomorphism invariance is too strong: the usual "uniform distribution" on $[0,1]$ isn't invariant under the automorphism $x \mapsto x^2$. With a bit more work, one can prove that an automorphism-invariant probability measure on $[0,1]$ doesn't exist.

I don't deny that people talk about a "uniform distribution" on $[0,1]$ and that this makes good sense. However, this refers to an incarnation of $[0,1]$ in a category where the objects have additional structure. So I think that the term makes sense whenever one has extra structure which induces a distinguished (finite) reference measure, such as a Riemannian metric with finite volume. But such additional structure is not needed in the foundations of probability: the amazing thing is that the mere structure of (something like) a measurable space is sufficient for many of its constructions and theorems. So basic probability enjoys much stronger invariance properties than these "uniform distributions" do, and presumably we want to preserve that.

Do you still disagree? If so, what do you mean by "uniform distribution"?

view this post on Zulip Jules Hedges (May 13 2022 at 09:52):

Morgan Rogers (he/him) said:

spaces which do admit uniform distributions are far more prevalent and useful than wild measure spaces that don't

I must also point out that $\mathbb{R}$ with its standard measure doesn't admit a uniform distribution

view this post on Zulip Nathaniel Virgo (May 13 2022 at 10:09):

Or $\mathbb{N}$ for that matter

view this post on Zulip Morgan Rogers (he/him) (May 13 2022 at 17:41):

Tobias Fritz said:

I don't deny that people talk about a "uniform distribution" on $[0,1]$ and that this makes good sense. However, this refers to an incarnation of $[0,1]$ in a category where the objects have additional structure. So I think that the term makes sense whenever one has extra structure which induces a distinguished (finite) reference measure, such as a Riemannian metric with finite volume. But such additional structure is not needed in the foundations of probability: the amazing thing is that the mere structure of (something like) a measurable space is sufficient for many of its constructions and theorems. So basic probability enjoys much stronger invariance properties than these "uniform distributions" do, and presumably we want to preserve that.

Do you still disagree? If so, what do you mean by "uniform distribution"?

I'm on board with seeing what you can do without extra assumptions, but it seemed to me you were asserting that the extra structure wasn't worth considering, when there is clearly plenty of probability theory (or applications thereof) that does involve uniform distributions. I didn't really understand your counterexample for $[0,1]$; in what category are you working for the morphism $I \to X$ to be a distribution while $x \mapsto x^2$ is an automorphism on $[0,1]$ (did you mean for the mapping to be equipped with some extra info?).

The specific example I shall mention (although I've failed for the past half hour to find a nice probability monad whose Kleisli category it can live in) is to consider compact (hence locally compact) topological groups. These have a distinguished uniform probability measure, the Haar measure. I remember my representation theory lecturer back in undergrad explaining that this is what is used to extend results from representation theory for finite groups to compact groups, which is at least modestly important. Of course, I could be mistaken in supposing that these will fit neatly into the Markov category formalism, because there are enough gadgets flying around in this topic that there are likely compatibility conditions that I'm ignorant of, but maybe the suggestion could be motivating in its own right?

view this post on Zulip Tobias Fritz (May 13 2022 at 19:28):

Morgan Rogers (he/him) said:

I'm on board with seeing what you can do without extra assumptions, but it seemed to me you were asserting that the extra structure wasn't worth considering, when there is clearly plenty of probability theory (or applications thereof) that do involve uniform distributions.

Right. My statements are all to be understood as living in a specific (and quite foundational) context in which those extra structures are not relevant. And that covers quite a lot of ground already!

I didn't really understand your counterexample for $[0,1]$; in what category are you working for the morphism $I \to X$ to be a distribution while $x \mapsto x^2$ is an automorphism on $[0,1]$ (did you mean for the mapping to be equipped with some extra info?).

Just the Kleisli category of the Giry monad on (say standard Borel) measurable spaces. So morphisms $I \to X$ are probability measures on $X$, and through the Kleisli adjunction every measurable map is still a morphism in here. Considering $[0,1]$ as a measurable space with its Borel $\sigma$-algebra as usual, the measurable map $x \mapsto x^2$ has measurable inverse $x \mapsto \sqrt{x}$ and is therefore an automorphism.

The specific example I shall mention (although I've failed for the past half hour to find a nice probability monad whose Kleisli category it can live in) is to consider compact, locally compact topological groups. These have a distinguished uniform probability measure, the Haar measure. I remember my representation theory lecturer back in undergrad explaining that this is what is used to extend results from representation theory for finite groups to compact groups, which is at least modestly important. Of course, I could be mistaken in supposing that these will fit neatly into the Markov category formalism, because there are enough gadgets flying around in this topic that there are likely compatibility conditions that I'm ignorant of, but maybe the suggestion could be motivating in its own right?

Yes, absolutely! It goes a bit beyond the foundational setting that I've been referring to, but it's clearly very important.

As for a categorical framework for this, here's one attempt. The category of compact Hausdorff spaces carries a probability monad called the [[Radon monad]] $R$. It follows that $Kl(R)$ is a Markov category. Its morphisms are continuous Markov kernels, and its subcategory of deterministic morphisms is exactly the usual $\mathsf{CHaus}$. Your group $G$ can now be interpreted as a group object in this category (since the local compactness is trivially true by compactness). Hence we can talk about the category of internal $G$-spaces in $Kl(R)$, which is again a Markov category, call it $Kl(R)^G$. I believe that the existence and uniqueness of the Haar measure amount to saying that when taking $G$ to act on itself (from either side) and considering it as an object in $Kl(R)^G$, then there is a unique morphism $I \to G$. Right?

If I remember correctly, we got into the discussion about uniform distributions by talking about the relevant categorical structure for probability, where I had claimed that one doesn't want objects to have distinguished units. I think that the categorical formulation of the previous paragraph illustrates my point: the object $G$ does have a distinguished unit, but I wouldn't know what the distinguished unit of a general object should be.

view this post on Zulip Chris Barrett (May 14 2022 at 07:16):

Tobias Fritz said:

I've heard of probabilistic coherence spaces, but unfortunately linear logic has always been obscure to me... Very vaguely, is it fair to say that the category of probabilistic coherence spaces is something like (closed) convex cones in (topological) vector spaces, and morphisms certain positive linear maps? If so, then that sounds very much like the category of what quantum physicists would call generalized probabilistic theories. This sort of thing seems to be of interest in the foundations of quantum mechanics, but I think that it's not so useful in the context of classical probability since the comultiplications or copy morphisms no longer exist.

Indeed it would seem that you're roughly correct about PCS: https://arxiv.org/pdf/1803.06005.pdf

view this post on Zulip Tobias Fritz (May 14 2022 at 07:49):

Oh, nice! This looks like it could actually be helpful for someone like me to understand linear logic :sweat_smile:

view this post on Zulip Morgan Rogers (he/him) (May 15 2022 at 11:04):

Tobias Fritz said:

I didn't really understand your counterexample for $[0,1]$; in what category are you working for the morphism $I \to X$ to be a distribution while $x \mapsto x^2$ is an automorphism on $[0,1]$ (did you mean for the mapping to be equipped with some extra info?).

Just the Kleisli category of the Giry monad on (say standard Borel) measurable spaces. So morphisms $I \to X$ are probability measures on $X$, and through the Kleisli adjunction every measurable map is still a morphism in here. Considering $[0,1]$ as a measurable space with its Borel $\sigma$-algebra as usual, the measurable map $x \mapsto x^2$ has measurable inverse $x \mapsto \sqrt{x}$ and is therefore an automorphism.

Ah so the endomorphism is "really" $x \mapsto \delta_{x^2}$. I did the computation of the composition in the Kleisli category, where we indeed get $p(U) = \mathrm{unif}_X(\{x \mid x^2 \in U\})$, as you alluded to.
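A quick numerical sanity check (illustrative Haskell, not part of the thread; `pushforwardMass` is a made-up helper) confirms the point: the pushforward of the uniform distribution along $x \mapsto x^2$ assigns mass $0.5$, not $0.25$, to the interval $[0, 0.25)$, so the would-be uniform state is not preserved by this automorphism:

```haskell
-- Estimate, on a grid, the mass that the pushforward of unif[0,1]
-- along f assigns to the interval [lo, hi).
pushforwardMass :: (Double -> Double) -> (Double, Double) -> Double
pushforwardMass f (lo, hi) = fromIntegral (length hits) / fromIntegral n
  where
    n    = 100000 :: Int
    grid = [ fromIntegral i / fromIntegral n | i <- [0 .. n - 1] ]
    hits = [ x | x <- grid, let y = f x, y >= lo, y < hi ]

-- pushforwardMass (\x -> x * x) (0, 0.25)  is about 0.5, i.e. P(x <= 0.5)
```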

The category of compact Hausdorff spaces carries a probability monad called the [[Radon monad]] $R$. It follows that $Kl(R)$ is a Markov category. Its morphisms are continuous Markov kernels, and its subcategory of deterministic morphisms is exactly the usual $\mathsf{CHaus}$. Your group $G$ can now be interpreted as a group object in this category (since the local compactness is trivially true by compactness). Hence we can talk about the category of internal $G$-spaces in $Kl(R)$, which is again a Markov category, call it $Kl(R)^G$. I believe that the existence and uniqueness of the Haar measure amount to saying that when taking $G$ to act on itself (from either side) and considering it as an object in $Kl(R)^G$, then there is a unique morphism $I \to G$. Right?

Almost; in general the left and right Haar measures are different; there's an example further down the Wikipedia entry, but considering the left and right actions independently is fine. Thanks for putting this together, I'm very pleased to discover the potential for such nice interactions between the construction of the category of internal groups and the Kleisli category (I haven't much experience thinking about Markov categories, but being able to build them in many different ways is exciting from the point of view of eventually being able to apply your results in an exotic setting without too much effort).

If I remember correctly, we got into the discussion about uniform distributions by talking about the relevant categorical structure for probability, where I had claimed that one doesn't want objects to have distinguished units. I think that the categorical formulation of the previous paragraph illustrates my point: the object $G$ does have a distinguished unit, but I wouldn't know what the distinguished unit of a general object should be.

I see what you mean. I think this became a debate because I was reading the OP question as asking "what would happen if a Markov category had structure XYZ?" (a pretty reasonable question) rather than "should/do Markov categories come equipped with structure XYZ (for the purposes of synthetic probability theory)?" which you have made a solid argument against. Thanks for taking the time to address my concerns!

view this post on Zulip Jules Hedges (May 15 2022 at 12:26):

Oh, now I can go back and make sense of a lot of the discussion in context. If you have a traced Markov category then every object $X$ admits a canonical state $\mathrm{Tr}_X \Delta_X : I \to X$, which can reasonably be expected to behave like a uniform / max entropy distribution at least in "reasonable" examples. That helps for understanding which Markov categories can be expected to admit a trace, and which definitely shouldn't

view this post on Zulip Jules Hedges (May 15 2022 at 12:30):

You can pull the same trick in any traced cartesian category. In a category of computable functions (or a model of computable functions) where trace is non-well-founded recursion, the point $\mathrm{Tr}_X \Delta_X : 1 \to X$ is the nonterminating value of $X$, which is the state about which you have the least knowledge

view this post on Zulip Jules Hedges (May 15 2022 at 12:32):

Aka the thing corresponding to the Haskell code `let x = x in x`, which loops forever if you evaluate it
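In code (a minimal sketch; `Data.Function.fix` is the standard least-fixed-point combinator in base):

```haskell
import Data.Function (fix)

-- The canonical state Tr_X Δ_X, read as recursion, is the least fixed
-- point of the identity, i.e. bottom; forcing it loops forever.
bottom :: a
bottom = fix id          -- same as: let x = x in x

-- Tracing a productive map instead yields useful infinite data:
ones :: [Int]
ones = fix (1 :)         -- the infinite stream 1, 1, 1, ...
```

`take 3 ones` evaluates to `[1,1,1]`, while any attempt to force `bottom` diverges — matching the "state of least knowledge" reading above.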

view this post on Zulip Sam Staton (May 17 2022 at 20:39):

Hi @Jules Hedges, do you have a traced Markov category that does this? Would be very interesting. I am looking at the traced premonoidal paper again. Proposition 3.2.1 seems to specialize to Markov categories to give a correspondence between trace operators on Markov categories and a special kind of fixed-point operator (Def 3.3). This is a generalization of Hasegawa/Hyland's characterization for cartesian cats. Now for the kind of trace you are talking about, with trace=uniform distribution, from that perspective, I'm worried about the "diagonal property" (Def 3.3(3))...

view this post on Zulip Jules Hedges (May 18 2022 at 09:51):

I don't, but nor have I tried, it's just an initial idea for a construction. Failure is definitely an option

view this post on Zulip Sam Staton (May 19 2022 at 05:25):

It’s interesting. I conjecture that there’s also a non-trivial traced Markov category that just does the mundane Haskell-like thing. But that's not inconsistent with your idea -- there could be very different traces around.

view this post on Zulip Sam Staton (May 19 2022 at 05:36):

Also, speculating on the topic of uniform distributions. I sort-of hope that “synthetic probability” might help us understand improper priors, in the same way that SDG helps us understand infinitesimals. For example, Dario's Markov category GaussEx (here) does have a uniform distribution on all of $\mathbb{R}$. (Or rather, it has a morphism that could be thought of like that.)

view this post on Zulip Dario Stein (Dec 20 2023 at 12:57):

Interesting thread! I've just started looking into traces because they seem relevant to comb factorizations as used here. In order for comb application to be well-defined, we want an implicational axiom like the following to hold
combs.jpeg
This axiom can be proven in every Markov category that embeds faithfully in a compact closed category (as in the paper), or even in a traced category.

Now, the axiom is in fact invalid when we have atomless distributions around. Take $\alpha$ to be the uniform distribution on $[0,1]$, and let $\gamma_1(x,y) = [x=y]$ and $\gamma_2(x,y) = 0$ (Iverson brackets). Then the left-hand sides are equal, but the right-hand sides are not (take $f = \mathsf{id}$). This shows that a whole host of Markov categories cannot be embedded faithfully in any traced category, for example BorelStoch, Stoch, or nominal sets with name generation.

Tagging @Nathaniel Virgo because I remember we talked about comb factorizations.

view this post on Zulip Dylan Braithwaite (Dec 20 2023 at 14:26):

Dario Stein said:

Interesting thread! I've just started looking into traces because they seem relevant to comb factorizations as used here. In order for comb application to be well-defined, we want an implicational axiom like the following to hold [...]

For what it's worth I think you can talk about comb factorisations in a more general setting by defining 2-combs as optics.

The comb-factorisation from that paper then factorises a state $p : I \to A \otimes B \otimes C$ into an optic $\binom{I}{C} \to \binom{A}{B}$ and a morphism $A \to B$ (equivalently an optic $\binom{A}{B} \to \binom{I}{I}$).

Since you can have unequal optics $\binom{I}{C} \to \binom{A}{B}$ which induce equal morphisms $B \to A \otimes C$, this sidesteps the need for the extra coherence axiom for application.

view this post on Zulip Dario Stein (Dec 20 2023 at 17:31):

Interesting, is the optics approach sufficient to replicate the arguments in the paper? In the end, the goal is to prove certain equalities between morphisms, no?

view this post on Zulip Dylan Braithwaite (Dec 20 2023 at 18:05):

I haven't worked through the arguments in the final section in detail, but I think it all works the same. In either case "2-combs" are just an intermediate structure used in the definition of an intervention, that you end up mapping back into the underlying Markov category

view this post on Zulip Dylan Braithwaite (Dec 20 2023 at 18:16):

There's maybe a neater way to see the relationship between the notion of combs as optics and the notion of combs using a trace in the paper, using the fact that optics somehow represent feedback operators on a category. That is, a feedback structure on a category (i.e. a trace without the yanking axiom) is equivalently a factorisation of the hom functor $\mathcal{C}^{\mathrm{op}} \times \mathcal{C} \to \mathbf{Set}$ through the inclusion $\mathcal{C}^{\mathrm{op}} \times \mathcal{C} \to \mathbf{Optic}(\mathcal{C})$. But I don't see how to make a precise connection right now

view this post on Zulip Sam Staton (Dec 21 2023 at 20:49):

Hi @Dario Stein, the theorem gives a nice new angle, thanks.
The trace I think we have implemented in the LazyPPL library is given by freezing the random seed and then using Haskell's recursion. I am using the fact that every type is pointed (lifted).
We also have equality test and uniform distributions in the library, so I wonder whether you have now shown that it is actually not a trace, or if there is a subtle point about divergence.

(The reason I haven't actually proved whether it is a trace is that I am still not sure what the categorical semantics of Haskell should be, once you consider non-termination.)

view this post on Zulip Nathaniel Virgo (Jan 02 2024 at 04:58):

Sorry for being late to this thread, @Dario Stein. Thank you for this! Indeed it completely answers the question I was asking people at ACT - in the negative, which is really surprising to me.

What I was trying to do was to start from the view that combs are optics, but then show that this reduces to something simpler in the case where the underlying category has conditionals, in a similar way to how optics reduce to lenses in a cartesian monoidal category. To make that work I thought I seemed to need an axiom similar to the one you posted. There might be some other way to do it now I know that that doesn't work, though, I'm not sure.