You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hello all! This is the thread of discussion for the talk of John Baez and Kenny Courser, "Coarse-graining open Markov processes".
Date and time: Tuesday July 7, 20:40 UTC.
Zoom meeting: https://mit.zoom.us/j/7055345747
YouTube live stream: https://www.youtube.com/watch?v=5Rs_7ZCH1Wc&list=PLCOXjXDLt3pZDHGYOIqtg1m1lLOURjl1Q
My talk is based on this paper:
My talk slides are here:
10 minutes!
Here is Kenny Courser's thesis:
David Jaz Myers wrote:
Idk about any black boxing functors yet but I'm very excited to find them.
Cool! To get the black boxing functor to work for the 2-morphisms of open Markov processes, we needed to demand that the squares in these 2-morphisms be pullbacks.
Then, to make sure that these 2-morphisms were closed under horizontal composition, we needed the legs in our open Markov processes to be monic.
Then we used that van Kampen cube to get the job done.
So, all the complications arose from wanting a black boxing double functor.
There might be a better way to do all this, but anyway: for us, black boxing is where things get interesting.
John Baez said:
Cool! To get the black boxing functor to work for the 2-morphisms of open Markov processes, we needed to demand that the squares in these 2-morphisms be pullbacks.
Then, to make sure that these 2-morphisms were closed under horizontal composition, we needed the legs in our open Markov processes to be monic.
Then we used that van Kampen cube to get the job done.
So, all the complications arose from wanting a black boxing double functor.
Very cool! So, as a proof of concept for using indexed categories as a "home" for categorical applications to dynamical systems, I defined an indexed double category of open Markov systems as you talked about them in this paper. Then I started to make an indexed double functor that "takes the Master equation" of an open Markov system, namely it goes from this indexed double cat to the indexed double cat of dynamical systems in the doctrine of first order ODEs on Euclidean spaces. I also had to keep asking for things to be pullbacks until adhesion kicked in for this to work.
Here's the construction. We will construct Mark : Rhizome --> Cat. Rhizome is my cute name for a double category whose objects are finite sets (I'll come to the rest in a bit, and explain the name). This assignment will have Mark(B) be the category of Markov processes with boundary set (undirected) B, and morphisms the Markov process morphisms (coarse grainings) which act as the identity on the boundary B.
Now, the vertical morphisms of Rhizome are what I call rhizome diagrams. They are cospans X -->> N <--< Y whose left leg is surjective and whose right leg is injective. We draw them as "bubble diagrams" (or "little potato diagrams", hence "rhizomes") in the usual way for cospans, but the conditions ensure that every outer port connects to some inner port. I started using this because it's my personal favorite diagram category, but it's useful here because of adhesion: the pushouts you get from composition are also pullbacks! One of these acts as Mark(X) --> Mark(Y) by sending a system (S, H, i : X >--> S) to (S', p_* H p^*, i') where S' is the pushout of i : X >--> S over X -->> N with p : S -->> S' the resulting map, and where i' : Y >--> N >--> S' is the composite.
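Since the construction leans on pushouts of finite sets, here is a minimal Python sketch (entirely my own, not code from David's construction; `pushout` and all names in it are illustrative) of how such a pushout can be computed by gluing with union-find:

```python
# A minimal sketch (not the authors' code): the pushout of two functions
# f : A -> B and g : A -> C in FinSet, computed by gluing f(a) ~ g(a)
# in the disjoint union of B and C with union-find.

def pushout(A, B, C, f, g):
    """Return (P, inB, inC): the pushout set (as frozensets of glued
    elements) and the two maps B -> P, C -> P."""
    # Work in the disjoint union: tag elements by which set they came from.
    nodes = [('B', b) for b in B] + [('C', c) for c in C]
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    def union(m, n):
        parent[find(m)] = find(n)

    for a in A:                       # glue f(a) with g(a) for every a in A
        union(('B', f[a]), ('C', g[a]))

    classes = {}
    for n in nodes:
        classes.setdefault(find(n), []).append(n)
    P = {root: frozenset(members) for root, members in classes.items()}
    inB = {b: P[find(('B', b))] for b in B}
    inC = {c: P[find(('C', c))] for c in C}
    return set(P.values()), inB, inC

# Gluing two 2-element sets along a shared point:
P, inB, inC = pushout({'a'}, {'x', 'y'}, {'u', 'v'},
                      {'a': 'x'}, {'a': 'u'})
# 'x' and 'u' are identified, so the pushout has 3 elements.
```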
The horizontal morphisms of Rhizome are functions. Mark(f) is the profunctor sending systems S, S' to the set of morphisms S --> S' which act as f on the boundary.
A square in Rhizome is weird (and here's where I started adding extra things until it worked). It is a normal map of cospans, but we ask that both squares are pullbacks. We need all these pullbacks around so we can use "Beck-Chevalley for vectors".
So if (CoKleisli(R^n x -), T) is the first order ODE doctrine, then we can make a double functor Rhizome --> Interface like this: We send a boundary set B to the pair (R^B, R^B) where we think of the "bottom" R^B (I'll write it second) as the space of populations on B and the "top" R^B (I'll write it first) as the space of flows on B. Then we send a rhizome X -->> N <--< Y to the lens given by pushing and pulling in the only way the signatures allow for: what this amounts to is saying that we add populations and flows when boundary nodes are glued together. The rest is kind of follow your nose; it all works by applying "Beck-Chevalley" a million times.
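A tiny numerical sketch (my own, with illustrative names `pushforward`/`pullback`; this is not code from the construction) of the "add populations when boundary nodes are glued" step along a rhizome X -->> N <--< Y:

```python
# Illustrative sketch: along a rhizome X -->> N <--< Y, populations on X
# are pushed forward to N by summing over fibers of the surjection, and
# values on N are restricted along the injection from Y.

def pushforward(q, populations):
    """Sum populations over the fibers of q : X -> N."""
    out = {}
    for x, n in q.items():
        out[n] = out.get(n, 0.0) + populations[x]
    return out

def pullback(j, values):
    """Restrict values on N along the injection j : Y -> N."""
    return {y: values[j[y]] for y in j}

# Two boundary nodes of X glued onto one node of N: populations add.
q = {'x1': 'n', 'x2': 'n'}       # surjection X -->> N
j = {'y': 'n'}                   # injection  Y >--> N
pops = {'x1': 0.4, 'x2': 0.6}
glued = pushforward(q, pops)     # population 1.0 at 'n'
seen = pullback(j, glued)        # what Y sees through the injection
```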
The indexed part of the indexed double functor takes the Master equation. I need to check functoriality more carefully here, because things are subtle. Hopefully it all follows from Kenny and your work!
Oops, I quoted everything...
Hi. Let me read this...
Btw, we use something I call "Beck-Chevalley for vectors" in our work on open Markov processes... that's why we want pullback squares of finite sets.
Also btw, it's "master equation", not "Master equation" - it was not invented by my student Jade Master. :upside_down:
A square in Rhizome is weird (and here's where I started adding extra things until it worked). It is a normal map of cospans, but we ask that both squares are pullbacks. We need all these pullbacks around so we can use "Beck-Chevalley for vectors".
I assume "normal" means "plain old ordinary" here... this sounds exactly like what we do in our maps between open Markov processes. Lemma 4.3 of Coarse-graining open Markov processes is what I call "Beck-Chevalley for vectors".
We use this to prove Theorem 4.2, which says that we can horizontally compose maps of open Markov processes.
So anyway, while I'm having a bit of trouble understanding what you're up to, it sounds very similar to some things we were doing.
John Baez said:
Also btw, it's "master equation", not "Master equation" - it was not invented by my student Jade Master. :upside_down:
Haha my bad
John Baez said:
A square in Rhizome is weird (and here's where I started adding extra things until it worked). It is a normal map of cospans, but we ask that both squares are pullbacks. We need all these pullbacks around so we can use "Beck-Chevalley for vectors".
I assume "normal" means "plain old ordinary" here... this sounds exactly like what we do in our maps between open Markov processes. Lemma 4.3 of Coarse-graining open Markov processes is what I call "Beck-Chevalley for vectors".
Yes! Exactly. I was doing all this with your paper open, and all the hard work is in your paper. I just forgot you had called it exactly that :)
Okay, cool, it'd be great to fit our work into your framework.
John Baez said:
So anyway, while I'm having a bit of trouble understanding what you're up to, it sounds very similar to some things we were doing.
My goal was to re-arrange your paper into the indexed double categorical framework, since that was what I was working with. I'll have to write it all up clearly :)
Please do!
Btw, I couldn't figure out how to do discrete-time Markov chains in this framework because when you try to compose two "open discrete-time Markov chains", you may identify two states, one of which has a 40% chance of hopping to A and a 60% chance of hopping to B, and another of which has a 50% chance of hopping to C and a 50% chance of hopping to D.
All of a sudden now the solution seems easy, which is to think of these not as probabilities but "not-necessarily-normalized probabilities".
Well, maybe not "easy" but "possible".
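The gluing problem above can be seen in a few lines (my own numeric sketch, using the 40/60 and 50/50 example from the discussion): adding the two outgoing distributions yields total weight 2, so the result is only a "not-necessarily-normalized probability".

```python
# Two states are identified: one hops to A (40%) / B (60%), the other to
# C (50%) / D (50%). Merging by adding outgoing weights gives total 2.0,
# so the merged state's outgoing weights no longer form a probability
# distribution.

s1 = {'A': 0.4, 'B': 0.6}
s2 = {'C': 0.5, 'D': 0.5}

def glue(d1, d2):
    """Merge two outgoing distributions by adding weights."""
    out = dict(d1)
    for k, v in d2.items():
        out[k] = out.get(k, 0.0) + v
    return out

merged = glue(s1, s2)
total = sum(merged.values())   # 2.0, not 1.0
```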
It would be really cool to get a general framework for stochastic processes that could easily handle these two different concepts of time in a unified way, even down to the nitty-gritty details.
John Baez said:
It would be really cool to get a general framework for stochastic processes that could easily handle these two different concepts of time in a unified way, even down to the nitty-gritty details.
This sounds like a wonderful stress test for my doctrines!
I can say that for any strong monad morphism you get a morphism of doctrines of M-automata. So the unit always gives you a map from deterministic automata (id-automata) to any M-automata. Actually I was writing more, but now I'm not so sure what the two concepts of time are - could you say more?
Is it "discrete time" versus "continuous time (via ODE)"?
Yes, I was just explaining the problems with discrete time, which made me write a paper about continuous time. I don't know if my explanation made sense.
In continuous time it's very obvious how to add "probabilistic rates"; in discrete time the analogous thing seems to be adding "probabilities", but that can't be quite right.
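The asymmetry is easy to check numerically (a sketch of mine, with toy matrices of my choosing): infinitesimal stochastic matrices (off-diagonal entries nonnegative, columns summing to zero) are closed under addition, so rates add; stochastic matrices (columns summing to one) are not, so probabilities don't.

```python
# Rates add, probabilities don't: the sum of two infinitesimal stochastic
# matrices is again infinitesimal stochastic, but the sum of two
# stochastic matrices has columns summing to 2.

def col_sums(M):
    return [sum(row[j] for row in M) for j in range(len(M[0]))]

H1 = [[-1.0, 2.0], [1.0, -2.0]]    # infinitesimal stochastic
H2 = [[-3.0, 0.5], [3.0, -0.5]]    # infinitesimal stochastic
H = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(H1, H2)]
# columns of H still sum to 0: adding rates is fine

U1 = [[0.4, 0.5], [0.6, 0.5]]      # stochastic
U2 = [[0.9, 0.2], [0.1, 0.8]]      # stochastic
U = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(U1, U2)]
# columns of U sum to 2, not 1: adding probabilities breaks stochasticity
```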
Ah I see. I don't know if there is a "normalize the populations" strong monad morphism from, say, N --> D, where N is the free commutative monoid monad and D is the probability distribution monad (on finite sets... but then they're not really monads... relative monads? yikes).
Actually, wait, there is a function with that signature, but I'm not sure it commutes with multiplication. No it doesn't: 1(1) + 1(1 + 1) goes either to (1/2, 1/4, 1/4) or (1/3, 1/3, 1/3). This must be the problem you're talking about.
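Spelling out that check (my own sketch and representation): the formal sum 1·(1) + 1·(1 + 1) is a multiset of multisets, and normalizing before flattening gives a different answer than flattening first.

```python
# "Normalize" fails to commute with the monad multiplication (flattening).

def normalize(weights):
    t = sum(weights)
    return [w / t for w in weights]

# Outer multiset: weight 1 on each of two inner multisets, (1) and (1+1).
outer = [1.0, 1.0]
inners = [[1.0], [1.0, 1.0]]

# Normalize at both levels, then flatten by multiplying weights:
norm_then_flatten = [po * pi
                     for po, inner in zip(normalize(outer), inners)
                     for pi in normalize(inner)]
# -> 1/2, 1/4, 1/4

# Flatten first (weights multiply), then normalize:
flatten_then_norm = normalize([po * w
                               for po, inner in zip(outer, inners)
                               for w in inner])
# -> 1/3, 1/3, 1/3
```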
Well, I'm pretty sure that you can land in "unnormalized probabilities" aka populations, discretely. I do have a morphism of doctrines from continuous time to discrete time (the Euler method) which would work because the changes to boundary populations are affine :)
So how does that work? You discretize the time dimension by assuming the system is locally linear on intervals shorter than $\Delta t$ and then take discrete jumps with $p(t + \Delta t) = p(t) + \Delta t\, H p(t)$?
Here's the video!
https://www.youtube.com/watch?v=fDWcX8akYMI&list=PLCOXjXDLt3pYot9VNdLlZqGajHyZUywdI
The thing is, when probabilities change in discrete time they change multiplicatively, not additively:
$p(t+1) = U p(t),$
where $U$ is a so-called stochastic matrix.
I wonder if in this context the convolution-style summing is used often? It's something I think is needed often for infinite streams used to represent generating functions: $c_n = \sum_{k=0}^{n} a_k b_{n-k}$? (see https://esapulkkinen.github.io/cifl-math-library/cifl-math-library/Math-Number-Stream.html#v:codiagonals for implementation). Perhaps it only works when the matrices are infinite?
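For concreteness, here is that convolution-style sum (the Cauchy product of generating functions) on finite coefficient prefixes - a sketch of mine; the linked Haskell `codiagonals` works on genuinely infinite streams:

```python
# Cauchy product of two power series given by coefficient lists:
# c_m = sum_{k=0..m} a_k * b_{m-k}.

def cauchy(a, b, n):
    """First n coefficients of the product of the series with
    coefficients a and b."""
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

# 1/(1-x) has coefficients 1, 1, 1, ...; its square is 1, 2, 3, ...
ones = [1] * 10
print(cauchy(ones, ones, 5))   # [1, 2, 3, 4, 5]
```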
James Fairbanks said:
So how does that work? You discretize the time dimension by assuming the system is locally linear on intervals shorter than $\Delta t$ and then take discrete jumps with $p(t + \Delta t) = p(t) + \Delta t\, H p(t)$?
Yes, exactly.
John Baez said:
The thing is, when probabilities change in discrete time they change multiplicatively, not additively:
$p(t+1) = U p(t),$
where $U$ is a so-called stochastic matrix.
Isn't this just approximating the master equation with $\Delta t = 1$?
Hi @John Baez, it seems that the link to slides on your website is broken.
I don't know which link has the problem, but the link should be this:
http://math.ucr.edu/home/baez/open_markov/open_markov.pdf
David Jaz said:
John Baez said:
The thing is, when probabilities change in discrete time they change multiplicatively, not additively:
$p(t+1) = U p(t),$
where $U$ is a so-called stochastic matrix.
Isn't this just approximating the master equation with $\Delta t = 1$?
Call it what you want, but when $U$ is a stochastic matrix this is the usual definition of a discrete-time Markov chain. Probabilities are conserved exactly, not approximately, and there are entire books on discrete-time Markov chains, so saying it's "just" an approximation to a continuous-time Markov chain strikes me as unduly dismissive. I'd say it's its own sort of thing.
A continuous-time Markov chain is a continuous homomorphism from $[0, \infty)$ into a monoid of stochastic matrices. A discrete-time Markov chain is a homomorphism from $\mathbb{N}$ into a monoid of stochastic matrices. I think this way of thinking about it is nice.
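The discrete-time half of this can be sketched directly (my own toy example; the matrix $U$ is made up): a homomorphism from $\mathbb{N}$ is determined by one stochastic matrix, sending $n \mapsto U^n$.

```python
# A discrete-time Markov chain as a monoid homomorphism n |-> U^n,
# for a stochastic matrix U (columns sum to 1).

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(U, n):
    R = [[1.0 if i == j else 0.0 for j in range(len(U))]
         for i in range(len(U))]
    for _ in range(n):
        R = matmul(R, U)
    return R

U = [[0.9, 0.5], [0.1, 0.5]]   # stochastic: each column sums to 1

# Homomorphism property: U^(m+n) = U^m U^n.
lhs = matpow(U, 5)
rhs = matmul(matpow(U, 2), matpow(U, 3))
# Every U^n is again stochastic, so probability is conserved exactly.
```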
John Baez said:
I don't know which link has the problem, but the link should be this:
Thanks for the link, I meant the slides you mentioned on http://math.ucr.edu/home/baez/open_markov/ (the link at "You can see the slides here")
Thanks! Fixed!
John Baez said:
David Jaz said:
John Baez said:
The thing is, when probabilities change in discrete time they change multiplicatively, not additively:
$p(t+1) = U p(t),$
where $U$ is a so-called stochastic matrix.
Isn't this just approximating the master equation with $\Delta t = 1$?
Call it what you want, but when $U$ is a stochastic matrix this is the usual definition of a discrete-time Markov chain. Probabilities are conserved exactly, not approximately, and there are entire books on discrete-time Markov chains, so saying it's "just" an approximation to a continuous-time Markov chain strikes me as unduly dismissive. I'd say it's its own sort of thing.
Apologies, I didn't mean to put down the stochastic matrix with my use of "just". I guess it's a bad habit I picked up from saying "just a monoid in the category of endofunctors" too many times :)
What I should have said was "precisely" or "exactly". And, thinking about it more, the answer to that is: no. The formula I would get would be
$p(t+1) = (1 + H)\, p(t),$
using $1 + H$ and not $e^H$! Again, as you mention above, the difficulty is normalizing the populations into probabilities.
John Baez said:
The thing is, when probabilities change in discrete time they change multiplicatively, not additively:
$p(t+1) = U p(t),$
where $U$ is a so-called stochastic matrix.
Mmm, it seems to me that discrete and continuous time variation go under the same "multiplicative" umbrella. For instance, for the canonical random walk on a graph you have indeed
$f_{m+1} = M f_m,$
where $f_m$ is the state function at time $m$ (giving the distribution of the random walker at time $m$) and $M$ is the next-neighbour averaging operator $(Mf)(x) = \frac{1}{\deg x} \sum_{y \sim x} f(y)$. But if you recall that the discrete Laplacian is just $\Delta = M - 1$, you get for the discrete derivative
$f_{m+1} - f_m = \Delta f_m,$
which is exactly the discrete analogue of the heat equation
$\partial_t f = \Delta f,$
and so on...
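This random-walk picture can be run on a toy graph (my own example, a 3-cycle, where the averaging operator and the walk coincide since the graph is regular):

```python
# Random walk on a triangle graph: f_{m+1} = M f_m, with M the
# next-neighbour averaging operator, so the discrete derivative
# f_{m+1} - f_m is (M - 1) f_m, the discrete Laplacian applied to f.

neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1]}   # every vertex has degree 2

def average(f):
    """(Mf)(x) = mean of f over the neighbours of x."""
    return [sum(f[y] for y in neighbours[x]) / len(neighbours[x])
            for x in neighbours]

f = [1.0, 0.0, 0.0]                  # walker starts at vertex 0
step = average(f)                    # f_1 = M f_0 = [0, 1/2, 1/2]
laplacian = [s - v for s, v in zip(step, f)]   # (M - 1) f_0
# Total probability is conserved, and repeated steps spread the walker
# toward the uniform distribution [1/3, 1/3, 1/3].
```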