Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: event: ACT20

Topic: July 7: John Baez et al.'s talk


view this post on Zulip Paolo Perrone (Jul 02 2020 at 02:16):

Hello all! This is the thread of discussion for the talk of John Baez and Kenny Courser, "Coarse-graining open Markov processes".
Date and time: Tuesday July 7, 20:40 UTC.
Zoom meeting: https://mit.zoom.us/j/7055345747
YouTube live stream: https://www.youtube.com/watch?v=5Rs_7ZCH1Wc&list=PLCOXjXDLt3pZDHGYOIqtg1m1lLOURjl1Q

view this post on Zulip John Baez (Jul 03 2020 at 14:28):

My talk is based on this paper:

view this post on Zulip John Baez (Jul 03 2020 at 23:10):

My talk slides are here:

view this post on Zulip Paolo Perrone (Jul 07 2020 at 20:30):

10 minutes!

view this post on Zulip John Baez (Jul 07 2020 at 21:12):

Here is Kenny Courser's thesis:

view this post on Zulip John Baez (Jul 07 2020 at 21:18):

David Jaz Myers wrote:

Idk about any black boxing functors yet but I'm very excited to find them.

Cool! To get the black boxing functor to work for the 2-morphisms of open Markov processes, we needed to demand that the squares in these 2-morphisms be pullbacks.

Then, to make sure that these 2-morphisms were closed under horizontal composition, we needed the legs of our open Markov processes to be monic.

Then we used that van Kampen cube to get the job done.

So, all the complications arose from wanting a black boxing double functor.

view this post on Zulip John Baez (Jul 07 2020 at 21:18):

There might be a better way to do all this, but anyway: for us, black boxing is where things get interesting.

view this post on Zulip David Jaz (Jul 07 2020 at 22:20):

John Baez said:

Cool! To get the black boxing functor to work for the 2-morphisms of open Markov processes, we needed to demand that the squares in these 2-morphisms be pullbacks.

Then, to make sure that these 2-morphisms were closed under horizontal composition, we needed the legs of our open Markov processes to be monic.

Then we used that van Kampen cube to get the job done.

So, all the complications arose from wanting a black boxing double functor.

Very cool! So, as a proof of concept for using indexed categories as a "home" for categorical applications to dynamical systems, I defined an indexed double category of open Markov systems as you talked about them in this paper. Then I started to make an indexed double functor that "takes the Master equation" of an open Markov system, namely it goes from this indexed double cat to the indexed double cat of dynamical systems in the doctrine of first order ODEs on Euclidean spaces. I also had to keep asking for things to be pullbacks until adhesion kicked in for this to work.

Here's the construction. We will construct Mark : Rhizome --> Cat. Rhizome is my cute name for a double category whose objects are finite sets (I'll come to the rest in a bit, and explain the name). This assignment will have Mark(B) be the category of Markov processes with boundary set (undirected) B, and morphisms the Markov process morphisms (coarse grainings) which act as the identity on the boundary B.

Now, the vertical morphisms of Rhizome are what I call rhizome diagrams. They are cospans X -->> N <--< Y whose left leg is surjective and whose right leg is injective. We draw them as "bubble diagrams" (or "little potato diagrams", hence "rhizomes") in the usual way for cospans, but the conditions ensure that every outer port connects to some inner port. I started using this because it's my personal favorite diagram category, but it's useful here because of adhesion: the pushouts you get from composition are also pullbacks! One of these acts as Mark(X) --> Mark(Y) by sending a system (S, H, i : X >--> S) to (S', p_* H p^*, i') where S' is the pushout of i : X >--> S along X -->> N with p : S -->> S' the resulting map, and where i' : Y >--> N >--> S' is the composite.

The horizontal morphisms of Rhizome are functions. Mark(f) is the profunctor sending systems S, S' to the set of morphisms S --> S' which act as f on the boundary.

A square in Rhizome is weird (and here's where I started adding extra things until it worked). It is a normal map of cospans, but we ask that both squares are pullbacks. We need all these pullbacks around so we can use "Beck-Chevalley for vectors".

So if (CoKleisli(R^n x -), T) is the first order ODE doctrine, then we can make a double functor Rhizome --> Interface like this: we send a boundary set B to the pair (R^B, R^B), where we think of the "bottom" R^B (I'll write it second) as the space of populations on B and the "top" R^B (I'll write it first) as the space of flows on B. Then we send a rhizome X -->> N <--< Y to the lens given by pushing and pulling in the only way the signatures allow for: what this amounts to is saying that we add populations and flows when boundary nodes are glued together. The rest is kind of follow-your-nose; it all works by applying "Beck-Chevalley" a million times.

The indexed part of the indexed double functor takes the Master equation. I need to check functoriality more carefully here, because things are subtle. Hopefully it all follows from Kenny and your work!
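The coarse-graining step described above (sending (S, H, i) to (S', p_* H p^*, i')) can be sketched numerically. A minimal Python sketch, assuming we represent H as a rate matrix and the surjection p as a dict; all names and numbers here are illustrative, not from the paper:

```python
# Sketch of coarse-graining along a surjection p : S ->> S'.
# H' = p_* H p^* becomes the matrix product P H P^T, where
# P[t][s] = 1 iff p(s) = t.  All names/numbers are illustrative.

def coarse_grain(H, p, n_big, n_small):
    """Push a rate matrix on n_big states down to n_small states along p."""
    P = [[1 if p[s] == t else 0 for s in range(n_big)] for t in range(n_small)]
    # p^* pulls functions back along p (P transposed); p_* sums fibers (P).
    return [[sum(P[t][s] * H[s][r] * P[u][r]
                 for s in range(n_big) for r in range(n_big))
             for u in range(n_small)]
            for t in range(n_small)]

# A 3-state infinitesimal stochastic matrix (columns sum to 0);
# states 1 and 2 get glued to a single coarse state.
H = [[-2.0, 1.0, 0.5],
     [ 2.0, -1.0, 0.5],
     [ 0.0, 0.0, -1.0]]
p = {0: 0, 1: 1, 2: 1}
H_coarse = coarse_grain(H, p, 3, 2)
```

Note that the columns of the coarse matrix still sum to zero, so probability flow is conserved by the coarse-grained process.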

view this post on Zulip David Jaz (Jul 07 2020 at 22:21):

Oops, I quoted everything...

view this post on Zulip John Baez (Jul 07 2020 at 22:26):

Hi. Let me read this...

view this post on Zulip John Baez (Jul 07 2020 at 22:27):

Btw, we use something I call "Beck-Chevalley for vectors" in our work on open Markov processes... that's why we want pullback squares of finite sets.

view this post on Zulip John Baez (Jul 07 2020 at 22:27):

Also btw, it's "master equation", not "Master equation" - it was not invented by my student Jade Master. :upside_down:

view this post on Zulip John Baez (Jul 07 2020 at 22:30):

A square in Rhizome is weird (and here's where I started adding extra things until it worked). It is a normal map of cospans, but we ask that both squares are pullbacks. We need all these pullbacks around so we can use "Beck-Chevalley for vectors".

I assume "normal" means "plain old ordinary" here... this sounds exactly like what we do in our maps between open Markov processes. Lemma 4.3 of Coarse-graining open Markov processes is what I call "Beck-Chevalley for vectors".

view this post on Zulip John Baez (Jul 07 2020 at 22:32):

We use this to prove Theorem 4.2, which says that we can horizontally compose maps of open Markov processes.

view this post on Zulip John Baez (Jul 07 2020 at 22:32):

So anyway, while I'm having a bit of trouble understanding what you're up to, it sounds very similar to some things we were doing.

view this post on Zulip David Jaz (Jul 07 2020 at 22:43):

John Baez said:

Also btw, it's "master equation", not "Master equation" - it was not invented by my student Jade Master. :upside_down:

Haha my bad

view this post on Zulip David Jaz (Jul 07 2020 at 22:44):

John Baez said:

A square in Rhizome is weird (and here's where I started adding extra things until it worked). It is a normal map of cospans, but we ask that both squares are pullbacks. We need all these pullbacks around so we can use "Beck-Chevalley for vectors".

I assume "normal" means "plain old ordinary" here... this sounds exactly like what we do in our maps between open Markov processes. Lemma 4.3 of Coarse-graining open Markov processes is what I call "Beck-Chevalley for vectors".

Yes! Exactly. I was doing all this with your paper open, and all the hard work is in your paper. I just forgot you had called it exactly that :)

view this post on Zulip John Baez (Jul 07 2020 at 22:46):

Okay, cool, it'd be great to fit our work into your framework.

view this post on Zulip David Jaz (Jul 07 2020 at 22:46):

John Baez said:

So anyway, while I'm having a bit of trouble understanding what you're up to, it sounds very similar to some things we were doing.

My goal was to re-arrange your paper into the indexed double categorical framework, since that was what I was working with. I'll have to write it all up clearly :)

view this post on Zulip John Baez (Jul 07 2020 at 22:46):

Please do!

view this post on Zulip John Baez (Jul 07 2020 at 22:48):

Btw, I couldn't figure out how to do discrete-time Markov chains in this framework because when you try to compose two "open discrete-time Markov chains", you may identify two states, one of which has a 40% chance of hopping to A and a 60% chance of hopping to B, and another of which has a 50% chance of hopping to C and a 50% chance of hopping to D.

view this post on Zulip John Baez (Jul 07 2020 at 22:49):

All of a sudden now the solution seems easy, which is to think of these not as probabilities but as "not-necessarily-normalized probabilities".

view this post on Zulip John Baez (Jul 07 2020 at 22:49):

Well, maybe not "easy" but "possible".
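The obstruction described above can be seen in a few lines. A minimal Python sketch with the illustrative 40/60 and 50/50 states from the example: gluing the two states adds their outgoing probabilities, so the total outflow becomes 2 rather than 1, i.e. a "not-necessarily-normalized" probability.

```python
# Illustrative sketch (numbers from the example above): identifying two
# states of an open discrete-time Markov chain adds their outgoing
# transition probabilities, so the glued state's outflow sums to 2.

out_first = {"A": 0.4, "B": 0.6}    # 40% chance of hopping to A, 60% to B
out_second = {"C": 0.5, "D": 0.5}   # 50% to C, 50% to D

glued = dict(out_first)
for target, prob in out_second.items():
    glued[target] = glued.get(target, 0.0) + prob

total = sum(glued.values())  # 2.0: no longer a probability distribution,
                             # but fine as an unnormalized one
```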

view this post on Zulip John Baez (Jul 07 2020 at 22:50):

It would be really cool to get a general framework for stochastic processes that could easily handle these two different concepts of time in a unified way, even down to the nitty-gritty details.

view this post on Zulip David Jaz (Jul 07 2020 at 22:54):

John Baez said:

It would be really cool to get a general framework for stochastic processes that could easily handle these two different concepts of time in a unified way, even down to the nitty-gritty details.

This sounds like a wonderful stress test for my doctrines!

I can say that for any strong monad morphism you get a morphism of doctrines of M-automata. So the unit always gives you a map from deterministic automata (id-automata) to any M-automata. Actually I was writing more but now I'm not so sure what the two concepts of time are? Could you say more?

view this post on Zulip David Jaz (Jul 07 2020 at 22:54):

Is it "discrete time" versus "continuous time (via ODE)"?

view this post on Zulip John Baez (Jul 07 2020 at 22:57):

Yes, I was just explaining the problems with discrete time, which made me write a paper about continuous time. I don't know if my explanation made sense.

view this post on Zulip John Baez (Jul 07 2020 at 22:58):

In continuous time it's very obvious how to add "probabilistic rates"; in discrete time the analogous thing seems to be adding "probabilities", but that can't be quite right.

view this post on Zulip David Jaz (Jul 07 2020 at 22:59):

Ah I see. I don't know if there is a "normalize the populations" strong monad morphism from, say, N --> D, where N is the free commutative monoid monad and D is the probability (distribution) monad (on finite sets... but then they're not really monads... relative monads? yikes).

view this post on Zulip David Jaz (Jul 07 2020 at 23:02):

Actually, wait, there is a function with that signature, but I'm not sure it commutes with multiplication. No, it doesn't: $1(1) + 1(1+1)$ goes either to $(1/2, 1/4, 1/4)$ or $(1/3, 1/3, 1/3)$. This must be the problem you're talking about.
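The counterexample can be checked directly. A small Python sketch (encoding multisets as dicts is an illustrative choice): normalizing the inner multisets before flattening gives 1/2, 1/4, 1/4, while flattening first and then normalizing gives 1/3 each.

```python
# Check that "normalize" fails to commute with flattening of multisets.
# The nested multiset is 1*{a} + 1*{b, c}; dicts-as-multisets is an
# illustrative encoding, not notation from the discussion.

def normalize(weights):
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

nested = [({"a": 1}, 1), ({"b": 1, "c": 1}, 1)]  # (inner multiset, weight)

# Normalize inner multisets first, then average with normalized weights.
outer_total = sum(w for _, w in nested)
first_normalize = {}
for inner, w in nested:
    for k, v in normalize(inner).items():
        first_normalize[k] = first_normalize.get(k, 0.0) + (w / outer_total) * v

# Flatten (multiply) first, then normalize.
flat = {}
for inner, w in nested:
    for k, v in inner.items():
        flat[k] = flat.get(k, 0.0) + w * v
first_flatten = normalize(flat)

# first_normalize is {a: 1/2, b: 1/4, c: 1/4}; first_flatten is 1/3 each.
```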

view this post on Zulip David Jaz (Jul 07 2020 at 23:05):

Well, I'm pretty sure that you can land in "unnormalized probabilities", aka populations, discretely. I do have a morphism of doctrines from continuous time to discrete time (the Euler method) which would work because the changes to boundary populations are affine :)

view this post on Zulip James Fairbanks (Jul 08 2020 at 01:04):

So how does that work? You discretize the time dimension by assuming the system is locally linear on intervals shorter than $\Delta t$ and then take discrete jumps with $u_{i+1} = u_i + f(u_i, t_i)\,\Delta t$
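For concreteness, a minimal forward-Euler sketch of this scheme (function names are illustrative):

```python
# Minimal forward Euler sketch (illustrative names): freeze the vector
# field on each interval of length dt and take the linear jump
# u_{i+1} = u_i + f(u_i, t_i) * dt.

def euler(f, u0, t0, dt, steps):
    u, t = u0, t0
    for _ in range(steps):
        u = u + f(u, t) * dt
        t = t + dt
    return u

# Example: du/dt = -u with u(0) = 1; ten steps of size 0.1 give
# approximately (1 - 0.1)**10.
u_final = euler(lambda u, t: -u, 1.0, 0.0, 0.1, 10)
```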

view this post on Zulip Paolo Perrone (Jul 08 2020 at 02:03):

Here's the video!
https://www.youtube.com/watch?v=fDWcX8akYMI&list=PLCOXjXDLt3pYot9VNdLlZqGajHyZUywdI

view this post on Zulip John Baez (Jul 08 2020 at 05:15):

The thing is, when probabilities change in discrete time they change multiplicatively, not additively:

$p_i(t+1) = \sum_j A_{ij} p_j(t)$

where $A_{ij}$ is a so-called stochastic matrix.
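A small numeric companion to this formula, with an illustrative 2-state stochastic matrix: each column of A sums to 1, so one step p(t+1) = A p(t) conserves total probability exactly.

```python
# One step of the update above with an illustrative 2-state chain.
# A[i][j] is the probability of hopping from state j to state i, so
# each *column* of A sums to 1 and total probability is conserved.

A = [[0.9, 0.2],
     [0.1, 0.8]]
p = [0.5, 0.5]

p_next = [sum(A[i][j] * p[j] for j in range(2)) for i in range(2)]
# sum(p_next) is still 1: the step is multiplicative, not additive.
```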

view this post on Zulip Esa Pulkkinen (Jul 08 2020 at 09:58):

I wonder if in this context the convolution-style summing is used often? Something that I think is needed often for infinite streams used to represent generating functions: $s_n = \sum_{i+j=n} a_i b_j z^n$? (see https://esapulkkinen.github.io/cifl-math-library/cifl-math-library/Math-Number-Stream.html#v:codiagonals for implementation). Perhaps it only works when the matrices are infinite?

view this post on Zulip David Jaz (Jul 08 2020 at 10:53):

James Fairbanks said:

So how does that work? You discretize the time dimension by assuming the system is locally linear on intervals shorter than $\Delta t$ and then take discrete jumps with $u_{i+1} = u_i + f(u_i, t_i)\,\Delta t$

Yes, exactly.

view this post on Zulip David Jaz (Jul 08 2020 at 10:55):

John Baez said:

The thing is, when probabilities change in discrete time they change multiplicatively, not additively:

$p_i(t+1) = \sum_j A_{ij} p_j(t)$

where $A_{ij}$ is a so-called stochastic matrix.

Isn't this just approximating the master equation with $\Delta t = 1$?

view this post on Zulip Tomáš Jakl (Jul 08 2020 at 10:57):

Hi @John Baez it seems that the link to slides on your website is broken.

view this post on Zulip John Baez (Jul 08 2020 at 14:53):

I don't know which link has the problem, but the link should be this:

http://math.ucr.edu/home/baez/open_markov/open_markov.pdf

view this post on Zulip John Baez (Jul 08 2020 at 14:55):

David Jaz said:

John Baez said:

The thing is, when probabilities change in discrete time they change multiplicatively, not additively:

$p_i(t+1) = \sum_j A_{ij} p_j(t)$

where $A_{ij}$ is a so-called stochastic matrix.

Isn't this just approximating the master equation with $\Delta t = 1$?

Call it what you want, but when $A_{ij}$ is a stochastic matrix this is the usual definition of a discrete-time Markov chain. Probabilities are conserved exactly, not approximately, and there are entire books on discrete-time Markov chains, so saying it's "just" an approximation to a continuous-time Markov chain strikes me as unduly dismissive. I'd say it's its own sort of thing.

view this post on Zulip John Baez (Jul 08 2020 at 14:59):

A continuous-time Markov chain is a continuous homomorphism from $([0,\infty), +)$ into a monoid of stochastic matrices. A discrete-time Markov chain is a homomorphism from $(\mathbb{N}^+, +)$ into a monoid of stochastic matrices. I think this way of thinking about it is nice.
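The discrete-time half of this description can be checked concretely: the homomorphism is determined by the image A of 1, so the image of n is A^n, and the homomorphism law says A^(m+n) = A^m A^n. A minimal Python sketch with an illustrative matrix:

```python
# Sketch of the discrete-time case: the chain is the homomorphism
# n |-> A**n, determined by one stochastic matrix A (illustrative numbers).
# We check the homomorphism law A^(m+n) = A^m A^n for m = 2, n = 3.

def matmul(X, Y):
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def matpow(A, n):
    result = [[1.0 if i == j else 0.0 for j in range(len(A))]
              for i in range(len(A))]  # identity matrix = image of 0
    for _ in range(n):
        result = matmul(result, A)
    return result

A = [[0.9, 0.2],
     [0.1, 0.8]]
lhs = matpow(A, 5)                         # image of 2 + 3
rhs = matmul(matpow(A, 2), matpow(A, 3))   # image of 2 times image of 3
```

The columns of each power also still sum to 1, so every image is again a stochastic matrix.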

view this post on Zulip Tomáš Jakl (Jul 08 2020 at 20:57):

John Baez said:

I don't know which link has the problem, but the link should be this:

http://math.ucr.edu/home/baez/open_markov/open_markov.pdf

Thanks for the link, I meant the slides you mentioned on http://math.ucr.edu/home/baez/open_markov/ (the link at "You can see the slides here")

view this post on Zulip John Baez (Jul 08 2020 at 21:00):

Thanks! Fixed!

view this post on Zulip David Jaz (Jul 09 2020 at 19:56):

John Baez said:

David Jaz said:

John Baez said:

The thing is, when probabilities change in discrete time they change multiplicatively, not additively:

$p_i(t+1) = \sum_j A_{ij} p_j(t)$

where $A_{ij}$ is a so-called stochastic matrix.

Isn't this just approximating the master equation with $\Delta t = 1$?

Call it what you want, but when $A_{ij}$ is a stochastic matrix this is the usual definition of a discrete-time Markov chain. Probabilities are conserved exactly, not approximately, and there are entire books on discrete-time Markov chains, so saying it's "just" an approximation to a continuous-time Markov chain strikes me as unduly dismissive. I'd say it's its own sort of thing.

Apologies, I didn't mean to put down the stochastic matrix with my use of "just". I guess it's a bad habit I picked up from saying "just a monoid in the category of endofunctors" too many times :)

What I should have said was "precisely" or "exactly". And, thinking about it more, the answer to that is: no. The formula I would get would be

$p_i(t+1) = \sum_j H_{ij} p_j(t)$

using $H$ and not $A$! Again, as you mention above, the difficulty is normalizing the populations into probabilities.

view this post on Zulip Jorge Soto-Andrade (Jul 10 2020 at 00:09):

John Baez said:

The thing is, when probabilities change in discrete time they change multiplicatively, not additively:

$p_i(t+1) = \sum_j A_{ij} p_j(t)$

where $A_{ij}$ is a so-called stochastic matrix.

Mmm, it seems to me that discrete and continuous time variation go under the same "multiplicative" umbrella. For instance, for the canonical random walk on a graph you have indeed $s_{n+1} = M(s_n)$,
where $s_m$ is the state function at time $m$ (giving the distribution of the random walker at time $m$) and $M$ is the next-neighbour averaging operator. But if you recall that the discrete Laplacian $\Delta$ is just $\Delta = M - \mathrm{Id}$, you get for the discrete derivative
$\frac{s_{n+1} - s_n}{1} = \Delta(s_n),$
which is exactly the discrete analogue of the heat equation $\frac{du_t}{dt} = \Delta(u_t)$,
and so on...
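This identity can be verified on a small example. A Python sketch for the random walk on a 4-cycle (an illustrative choice of graph): with M the next-neighbour averaging operator and Delta = M - Id, one step s_{n+1} = M s_n satisfies s_{n+1} - s_n = Delta s_n.

```python
# Random walk on a 4-cycle (illustrative graph).  M averages over the
# two neighbours of each vertex; Delta = M - Id; one step s_{n+1} = M s_n
# satisfies the discrete heat equation s_{n+1} - s_n = Delta s_n.

n = 4
s = [1.0, 0.0, 0.0, 0.0]  # walker starts at vertex 0

def M(state):
    """Next-neighbour averaging operator on the n-cycle."""
    return [0.5 * (state[(i - 1) % n] + state[(i + 1) % n]) for i in range(n)]

def Delta(state):
    """Discrete Laplacian Delta = M - Id."""
    averaged = M(state)
    return [averaged[i] - state[i] for i in range(n)]

s_next = M(s)
difference = [s_next[i] - s[i] for i in range(n)]
# difference equals Delta(s), the discrete analogue of du/dt = Delta(u).
```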