Hi folks,
Following on the discussion in #learning: questions > Deepening Understanding of Petri Nets, I have started reading the textbook Quantum Techniques for Stochastic Mechanics by Baez and Biamonte! This new stream is for discussing the book and posting my questions and thoughts as I progress through it. Thanks for the suggestion @Owen Lynch ! Now, onto the maths.
Having been in the public health research space for a while, I was quite excited when, starting on page 24, the discussion of Petri nets shifted to SI, SIR, and SIRS models. As I read through the Petri net theoretic treatment of these models, I couldn't help but wonder how these sorts of models differ from compartmental models (which I have studied in the past). For example, the differential equations that Petri nets give for the SIR model are exactly the same governing equations that I see when I consider these models from a compartmental modeling perspective.
I was happy to see the warning on page 27, "'compartmental model[s' are] closely related to stochastic Petri nets, but beware: [they] are not really Petri nets!" I fully respect the distinction and that the diagrams themselves (and their semantics) look different. But, to what extent are Petri Nets and Compartmental Models different? It seems like both have notions of mass transfer and balance laws, and I would assume similar techniques to create and solve an initial value problem for a given model.
Tangent: the joke, "[SIR Models are] the only mathematical model we know to have been knighted: Sir Model", was most excellent! Between the battle bunnies at the beginning and the bits of color commentary, it's a really accessible read!
But, to what extent are Petri Nets and Compartmental Models different?
Stochastic Petri net models (typically) use the 'law of mass action' to describe the rate at which transitions occur, while other compartmental models, like stock and flow models, are much more flexible in their description of these rates. You can read about the law of mass action in our book.
On the other hand, compartmental models - or at least stock and flow models - only allow one-in, one-out transitions, which are called 'flows', while Petri nets allow many-in, many-out transitions.
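To make the mass-action point concrete, here is a minimal sketch (my own illustration, not from the book) of the rate equation that the law of mass action gives for the SIR Petri net with transitions S + I → I + I and I → R; the rate constants `beta` and `gamma` are made-up values.

```python
# Sketch: SIR rate equation under the law of mass action.
# Infection S + I -> I + I fires at rate beta * S * I;
# recovery I -> R fires at rate gamma * I.
from scipy.integrate import solve_ivp

beta, gamma = 0.3, 0.1  # hypothetical rate constants

def sir_rate_equation(t, y):
    S, I, R = y
    infection = beta * S * I   # mass action: proportional to the product S * I
    recovery = gamma * I       # mass action: proportional to I
    return [-infection, infection - recovery, recovery]

sol = solve_ivp(sir_rate_equation, (0, 100), [0.99, 0.01, 0.0])
print(sol.y[:, -1])   # final proportions of S, I, R
```

This is exactly the compartmental SIR system, which is really the point being made above: for mass-action transitions the two pictures give the same equations.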
Let me give a link to the book, just for people who might want to join this thread:
Hey @John Baez , just wanted to say thanks for the response to my question! That answered it and helped me contextualize Petri nets a bit better in the landscape of diagrammatic languages.
So, a question that I’ve run up against is thinking about the “master equation” introduced in section 3 of the book. The test your understanding problem was this:
“Suppose we have a stochastic Petri net with $k$ species and one transition with rate constant $r$. Suppose the $i$th species appears $m_i$ times as the input of this transition and $n_i$ times as the output. A labeling of this stochastic Petri net is a $k$-tuple of natural numbers $\ell = (\ell_1, \dots, \ell_k)$ saying how many things are in each species. Let $\psi_\ell(t)$ be the probability that the labeling is $\ell$ at time $t$. Then the master equation looks like this:
$$\frac{d}{dt} \psi_{\ell'}(t) = \sum_\ell H_{\ell'\ell}\, \psi_\ell(t)$$
for some matrix of real numbers $H_{\ell'\ell}$. What is this matrix?”
After thinking through this for quite some time, I’m a bit stumped trying to figure out what this matrix is or what I should be trying to say for this problem.
So far, using the hints in the section, I’ve come up with the following idea:
As there are k species with their initial counts given in the tuple $\ell = (\ell_1, \dots, \ell_k)$, I surmised that for any element in this matrix, the value is the initial species count divided by the respective falling power for that species. I thought the matrix would be comprised of columns that denoted one species, and rows that represented all possible distinguishable members of that species divided by that species’ falling power.
As a result, I thought that this matrix represented some kind of means to determine, alongside the probability function, the probability of a particular species labeling at any given time point.
It seems to me that the matrix actually stays static given the constant rate r and the possible inputs and outputs used to calculate the probability function.
I’m not satisfied with this answer yet, but I’m actively trying not to look at the answer given as I want to make sure I understand concretely the master equation.
Could anyone help me with my reasoning here or what I am missing?
For what it’s worth, I don’t think I’m getting stuck in the quantum field theory at this point
Jacob Zelko said:
As there are k species with their initial counts given in the tuple $\ell = (\ell_1, \dots, \ell_k)$, I surmised that for any element in this matrix, the value is the initial species count divided by the respective falling power for that species. I thought the matrix would be comprised of columns that denoted one species, and rows that represented all possible distinguishable members of that species divided by that species’ falling power.
I feel your difficulty with this question may be a difficulty in understanding what sort of thing the matrix is.
As the problem states, both rows and columns of the matrix are indexed not by species or "distinguishable members of species" (whatever those are), but by labelings $\ell$. The matrix $H_{\ell'\ell}$ is the rate at which the probability of the Petri net having labeling $\ell'$ changes, as a function of time, given that it has labeling $\ell$.
So I'd look at the examples in the book more carefully, and think about trying to generalize those. If you give up, the book has answers to all the problems. But I think it would be good to write down some formula
before you look at the answer.
Even if your formula is wrong, trying to write it down will be good.
John Baez said:
but by labelings ℓ. The matrix Hℓ′ℓ is the rate at which the probability of the Petri net having labeling ℓ′ changes, as a function of time, given that it has labeling ℓ.
Actually, what is ℓ′? Is that the whole game here and asking that question spoils my attempt at the problem? Or is that something which should be understood implicitly?
I took ℓ to be a bespoke labeling, and ℓ' to be a bespoke labeling after a certain time (almost analogous to an initial state at t=0 and some state later after t).
John Baez said:
I feel your difficulty with this question may be a difficulty in understanding what sort of thing the matrix Hℓ′ℓ is.
I very much agree! I am giving this another attempt but knowing that we are looking for some kind of formula was a great nudge.
Here $\ell$ and $\ell'$ range over all possible labelings.
We're trying to understand how a probability distribution on labellings changes with time. It changes linearly:
$$\frac{d}{dt} \psi(t) = H \psi(t)$$
where $H$ is a linear operator. We can always take a linear operator and write it as a matrix, so this is short for
$$\frac{d}{dt} \psi_{\ell'}(t) = \sum_\ell H_{\ell'\ell}\, \psi_\ell(t).$$
Here's how I said this equation in plain English:
The matrix $H_{\ell'\ell}$ is the rate at which the probability of the Petri net having labeling $\ell'$ changes, as a function of time, given that it has labeling $\ell$.
This "given that" is a reference to the idea of conditional probability, the probability that something happens given that something else is true.
So, given that your Petri net has some arbitrary labeling $\ell$, we want to know the probability per time that a transition will occur that changes the labeling to some other arbitrary labeling, say $\ell'$. That's what the number $H_{\ell'\ell}$ tells us.
So your job in the puzzle is to tell me a formula for all these numbers $H_{\ell'\ell}$. Of course they will depend on what transitions are in your Petri net, and what rate constants they have.
All of this was supposed to be clear from what I wrote in the book, but if it's not - well, that just shows how people who write books tend to be talking to imagined readers who are somewhat different from their actual readers.
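Not to give away the puzzle, but it may help to see what sort of thing the matrix $H_{\ell'\ell}$ is in the very simplest case. Here is a minimal sketch (my own example, not the book's): one species with a single decay transition X → (nothing) with rate constant $r$, with the state space truncated at a maximum population just so it fits in a finite matrix.

```python
# Sketch: the master-equation matrix H for one species with a decay
# transition X -> (nothing) at rate constant r.  Labelings are just the
# population n = 0, 1, ..., N_MAX (truncated for illustration).
import numpy as np
from scipy.linalg import expm

r, N_MAX = 1.0, 10            # made-up rate constant and truncation
H = np.zeros((N_MAX + 1, N_MAX + 1))
for n in range(N_MAX + 1):
    H[n, n] -= r * n          # probability flows out of labeling n at rate r*n...
    if n >= 1:
        H[n - 1, n] += r * n  # ...and into labeling n-1 at the same rate

psi0 = np.zeros(N_MAX + 1)
psi0[N_MAX] = 1.0                   # start with N_MAX things, with probability 1
psi_t = expm(2.0 * H) @ psi0        # solve d/dt psi = H psi up to t = 2
print(psi_t.sum())                  # total probability stays 1
print(H.sum(axis=0))                # every column of H sums to zero
```

The columns-sum-to-zero property is what Section 4 of the book calls being "infinitesimal stochastic", and it is exactly what keeps total probability equal to 1.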
John Baez said:
So, given that your Petri net has some arbitrary labeling ℓ, we want to know the probability per time that a transition will occur that changes the labeling to some other arbitrary labeling, say ℓ′.
Ah! Ok, now I am understanding more why you were introducing Feynman diagrams! My thoughts about why you brought them in were tenuous but now framing up this problem more explicitly is helping me get a better sense of why it is discussed here.
John Baez said:
So your job in the puzzle is to tell me a formula for all these numbers $H_{\ell'\ell}$. Of course they will depend on what transitions are in your Petri net, and what rate constants they have.
Interesting. I had originally thought to start from the rate equation and just scrutinize the single transition in this problem, taking an approach where each term in a given rate equation comes from a transition. But given that the chapter started by saying "For this we need to forget the rate equation (temporarily) and learn about the 'master equation'", I mostly ignored section 2. Was that a bit too overzealous?
Studying the rate equation doesn't really help you understand what the master equation is. I see now that I threw out this problem very abruptly, before doing any examples. So I'd say this is a hard problem. Section 5 gives tons of examples, but I handle them using a bit more machinery.
Oh alright; would you suggest then I pause on this problem, read through sections 4 and 5, and then come back to this one?
Either that or just read the answer. There will be easier puzzles.
Ok, I am not ready to concede defeat just yet so I will push on through those other sections and come back later. Thank you so much John!
John Baez said:
but if it's not - well, that just shows how people who write books tend to be talking to imagined readers who are somewhat different from their actual readers.
I just wanted to say, as a quick aside, that perhaps I'm just not that ideal reader, as I come more from an engineering/health research background. Sure, I've been exposed to various kinds of diagrams and modeling and a little bit of quantum field theory, but I've never had a full course in quantum field theory before.
As you'll see, a lot of the initial portions of this book are not about explaining Petri nets: they're about a weird new analogy between quantum theory and probability theory, where we take amplitudes as analogous to probabilities (which is, prima facie, an insane thing to do). That's why the book is called Quantum Techniques for Stochastic Mechanics.
So if you wanted a nice friendly introduction to Petri nets, this is not that. But unfortunately I don't know any introduction to Petri nets that I like better!
This thread has piqued my interest in this book! I hope it's helpful (and not distracting) if I post here a question about part of section 3.
On page 33, we are contemplating this Petri net:
[Petri net diagram: the transition H + H + O → H₂O]
The book asks this question:
Suppose there are 10 hydrogen atoms and 5 oxygen atoms. How many ways can they form a water molecule?
The book says that there are $10 \cdot 9 \cdot 5 = 450$ ways, and then states:
Note that we're treating the hydrogen atoms as distinguishable, so there are $10 \cdot 9$ ways to pick them, not $\frac{10 \cdot 9}{2}$...
Even though we're treating the hydrogen atoms as distinguishable, I would have thought that there would still be only $\frac{10 \cdot 9}{2} = 45$ ways to pick two hydrogen atoms. Let us label each hydrogen atom with a number from 1 to 10, to emphasize that the hydrogen atoms are distinguishable, and to facilitate mention of a particular atom.
Then, if I don't divide by two, I think I must be counting as different two ways of picking hydrogen atoms that differ only in the order in which the atoms are selected.
In each case, I end up picking the same two labelled hydrogen atoms. So, the set of hydrogen atoms involved in the transition is the same in each case.
However, it seems like we wish to count these two ways of selecting hydrogen atoms as different, if I'm not misunderstanding. It's unclear to me why we would want to do this, though.
Is the idea perhaps that the order of selection of an atom corresponds to its particular role in the transition? For example, perhaps the first-selected hydrogen atom plays a different role in the transition than the second-selected hydrogen atom?
Here's a picture to visualize that idea:
[picture: the two selected hydrogen atoms landing in the two different hydrogen slots of the water molecule]
This picture aims to illustrate the idea that the two selected hydrogen atoms can participate in the transition in two different ways. Namely, a given selected hydrogen atom can end up in two different places in the resulting water molecule. So, from this perspective, I think it does make sense to keep track of the order in which we pick each hydrogen atom selected!
(I think I may have answered my question!)
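A quick brute-force check of the two counting conventions (my own, not from the book): treating the two hydrogen picks as ordered gives 10 · 9 · 5 = 450 ways, while treating them as unordered gives 45 · 5 = 225.

```python
# Count the ways to pick inputs for H + H + O -> H2O from
# 10 distinguishable hydrogens and 5 distinguishable oxygens.
from itertools import combinations, permutations

hydrogens, oxygens = range(10), range(5)

ordered = len(list(permutations(hydrogens, 2))) * len(oxygens)
unordered = len(list(combinations(hydrogens, 2))) * len(oxygens)

print(ordered)    # 90 * 5 = 450  (ordered pairs of hydrogens)
print(unordered)  # 45 * 5 = 225  (unordered pairs of hydrogens)
```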
I think you hit the nail on the head @David Egolf.
My understanding of this, if we jump ahead a bit from the current section, is that there are different philosophies in "counting" markings. If we went with your initial thought of there being ((10 * 9) / 2) ways, then my understanding is that falls more under a "collectivist" sort of notion. With these sort of "boring old petri nets", instead we do care about the individual selection of each species. I know John explained this better in his blog post here:
https://golem.ph.utexas.edu/category/2021/01/categories_of_nets_part_1.html
Additionally, underlying this book is a hidden flavor of category theory that I think gets a little bit of a treatment later towards the end of the book, which also gets into why the ability to distinguish all selections is important in a Petri net.
David Egolf said:
This thread has piqued my interest in this book! I hope it's helpful (and not distracting) if I post here a question about part of section 3.
Yay! More the merrier and the more to discuss! :smiley:
Actually the issue @David Egolf raises is quite subtle and I didn't understand it as well when I wrote this book. I agree that it's peculiar to treat water as being potentially made in two different ways depending on which hydrogen goes in which "slot", as David so kindly illustrated:
I don't want to defend this as being physically realistic. I'm not sure it is, and digging into this would involve issues about Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac statistics that I don't want to get into. Instead, I just want to affirm that this is the approach the book is taking, and also this is the usual approach chemists take when discussing the rate equation of a Petri net - i.e., this approach is not peculiar to me.
I also want to add two other things:
1) Suppose you don't like the idea that the rate equation for
H + H + O → H₂O
involves a factor of $h(h-1)\,o$ where $h$ is the number of H's present and $o$ is the number of O's. Suppose you prefer a factor of $\frac{h(h-1)}{2}\,o$. Then you can just divide the rate constant by 2, and that will have the same effect.
2) In fact the whole reaction
H + H + O → H₂O
is chemically unrealistic, or at least unlikely to occur. It's more realistic to break this up like
H + O → HO
H + HO → H₂O
or
H + H → H₂
H₂ + O → H₂O
or some other collection of reactions with only two molecules or ions as the inputs at each stage. The details depend on the situation (an ionized gas? reactions happening in liquid water?) but my main point is that it's rare, in chemistry, for three entities to meet each other at once and engage in a reaction.
@Jacob Zelko wrote:
With these sort of "boring old petri nets", instead we do care about the individual selection of each species. I know John explained this better in his blog post here....
Actually if you read those blog articles you'll see I claim the "boring old" Petri nets follow the collective token philosophy. This seems to contradict what I've just been saying, and in a sense it does, but it's all quite subtle, so I think for now the best thing - since we're talking about the book - is to focus on what the book is saying and not worry about what some annoying guy wrote in some blog articles about another paper. :upside_down:
John Baez said:
I don't want to defend this as being physically realistic. I'm not sure it is, and digging into this would involve issues about Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac statistics that I don't want to get into. Instead, I just want to affirm that this is the approach the book is taking, and also this is the usual approach chemists take when discussing the rate equation of a Petri net - i.e., this approach is not peculiar to me.
That makes sense! Thanks for clarifying! It would be somewhat interesting to know the rationale that chemists give for this modelling approach. (Does it perhaps enable making better quantitative predictions for certain experiments?)
John Baez said:
In fact the whole reaction
H + H + O → H₂O
is chemically unrealistic, or at least unlikely to occur. ... The details depend on the situation (an ionized gas? reactions happening in liquid water?) but my main point is that it's rare, in chemistry, for three entities to meet each other at once and engage in a reaction.
That also makes sense! So, we are already making a bit of an unrealistic model for the situation from the beginning. I suppose there is a huge amount of possible detail that could be incorporated to try and answer the question "How many ways can water form from hydrogen and oxygen?"
For example, one could keep track of the "trajectory" of each atom involved (pretending that atoms have a certain "position" at all times, which I am guessing is not quite true) as they collide with various other atoms, and eventually form part of a water molecule. If all this information is retained, then from this perspective there would be a huge number of possible ways to form a water molecule.
Jacob Zelko said:
David Egolf said:
This thread has piqued my interest in this book! I hope it's helpful (and not distracting) if I post here a question about part of section 3.
Yay! More the merrier and the more to discuss! :smiley:
Awesome! I may chime in now and then!
David Egolf said:
It would be somewhat interesting to know the rationale that chemists give for this modelling approach. (Does it perhaps enable making better quantitative predictions for certain experiments?)
I think the main rationale is that it makes absolutely no difference.
For simplicity let me just talk about our favorite example:
If we choose a positive number $r$ called the rate constant for the transition here (the blue box), we can write down two kinds of differential equation describing the process of two hydrogen atoms and an oxygen atom turning into a molecule of water: the rate equation and the master equation.
For both of these equations, chemists use a convention where the equation involves a term
$$r\, h(h-1)\, o$$
where $h$ is the number of hydrogen atoms and $o$ is the number of oxygen atoms. Here $h(h-1)$ is the number of ordered pairs of distinct hydrogen atoms.
We might alternatively use the convention where this term is
$$r\, \frac{h(h-1)}{2}\, o.$$
Here $\frac{h(h-1)}{2}$ is the number of unordered pairs of distinct hydrogen atoms.
But, if we change conventions like this, nothing happens if we also double the rate constant $r$.
This works quite generally: if you change your mind about counting ordered versus unordered tuples, the rate or master equation stays the same if you also multiply the rate constant by the right number!
The rate constant has no inherent significance except for its role in the rate or master equation.
So there is no advantage to either approach except perhaps mathematical convenience. The approach that chemists use, with ordered tuples, is more convenient because it avoids certain factors of $\frac{1}{2}$ that show up if you use unordered tuples. You wind up saving a lot of ink if you use the chemists' approach. This is good for the environment.
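A tiny numerical check of this convention-independence (the numbers here are made up for illustration):

```python
# The ordered-pair term r*h*(h-1)*o and the unordered-pair term with a
# doubled rate constant, (2r)*(h*(h-1)/2)*o, are the same number.
r, h, o = 0.7, 10, 5   # made-up rate constant and molecule counts

ordered_term = r * h * (h - 1) * o
unordered_term = (2 * r) * (h * (h - 1) / 2) * o
print(ordered_term, unordered_term)   # both 315.0
```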
John Baez said:
This works quite generally: if you change your mind about counting ordered versus unordered tuples, the rate or master equation stays the same if you also multiply the rate constant by the right number!
I see! Thanks for explaining that!
I suppose we could absorb the $h(h-1)\,o$ by redefining $r$ appropriately. But maybe this would be bad, somehow?
Ah, that wouldn't work! If we did that, we'd need to redefine $r$ every time we have a different number of hydrogen or oxygen atoms available. And we don't want to do that, as those quantities could indeed change as the system we are modelling evolves over time.
David Egolf said:
I suppose we could absorb the $h(h-1)\,o$ by redefining $r$ appropriately. But maybe this would be bad, somehow?
Ah, that wouldn't work! If we did that, we'd need to redefine $r$ every time we have a different number of hydrogen or oxygen atoms available. And we don't want to do that, as those quantities could indeed change as the system we are modelling evolves over time.
Right, that would not work at all.
By the way, in many situations the number of each kind of molecule is very large, so it's fine to approximate $h(h-1)$ by $h^2$, or more generally falling powers by ordinary powers.
But you should not treat them as constant in time!
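For what it's worth, here is a quick look at how good that approximation is as the counts grow (the counts below are made up):

```python
# Relative error of replacing the falling power h*(h-1) by the ordinary power h**2.
for h in [10, 1_000, 1_000_000]:
    falling, approx = h * (h - 1), h ** 2
    print(h, (approx - falling) / falling)   # error is about 1/h, negligible for huge h
```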
Hey folks! Had to step away for a few days due to being busy and reading through chapter 5 a few times. But now I am back!
About this section, I had a variety of questions, so some of them will be more high level and some more low level. This is the section where I am starting to have my mind blown by the analogy between quantum field theory and stochastic problems -- the fact that you can relate these two areas is fascinating to me!
When it says that the "creation and annihilation operators don't commute" and introduces the commutator, I was confused by what the commutator actually is given the definition:
$$[a, a^\dagger] = 1,$$
which is the same as $a a^\dagger - a^\dagger a = 1$ or, more generally, $[S, T] = ST - TS$.
What is this commutator of two operators?
One of the things I got from this section was the relationship between the diagrams and the quantum physics equations being used. For example:
My question regarding this chapter is: looking at the diagrams, am I supposed to take away how easily the diagrams map onto the equations used themselves? And how the creation operator is analogous to an output of the diagram and the annihilation operator is analogous to the inputs into a diagram?
And furthermore, thinking about the non-commutativity of the quantum operators here: is the fact that the inputs and outputs in a given diagram are irreversible (i.e. I can't flip the arrows of the breeding rabbits example, as that doesn't make sense per the context) analogous to the non-commutativity of the quantum operators?
I suppose my question, broadly, is: what intuition am I supposed to be getting from looking at the equations and looking at the diagrams?
As for some more broad questions, I had two additional ones:
Does the game of working with a master equation for a given situation generally boil down to figuring out how to represent the Hamiltonian? It seems like that was a crucial focus in section 5.
It is so strange and interesting to me that one can make the analogy between quantum physics and stochastic problems like rabbit population growth so strongly. I suppose more philosophically, how can the same equations developed to model quanta of energy be used to model population growth?
I'll freely admit that the second question here was the mind-blowing part of seeing this come together -- just how is this even possible?
Oh and one final question for now: what makes the Number operator, $N = a^\dagger a$, so special?
Jacob Zelko said:
When it says that the "creation and annihilation operators don't commute" and introduces the commutator, I was confused by what the commutator actually is given the definition:
$$[a, a^\dagger] = 1,$$
which is the same as $a a^\dagger - a^\dagger a = 1$ or, more generally, $[S, T] = ST - TS$.
I don't see anything quite so confusing in my book! On page 50, the first place the word "commutator" shows up, we wrote:
$$[a, a^\dagger] = 1$$
where the commutator of two operators is [S, T] = ST − TS.
When you see a term in boldface, you know we're defining it. Here we are defining the commutator of two operators S and T to be ST - TS, and abbreviating it as [S,T].
If I were being very formal I would have said
where the commutator of two operators S and T is ST − TS, abbreviated as [S,T].
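One concrete way to see the commutator in action (my own sketch): using the book's convention where a state with n things is the monomial $z^n$, the annihilation operator $a$ acts like $d/dz$ and the creation operator $a^\dagger$ acts like multiplication by $z$. On a state space truncated at $z^N$, the matrix of $a a^\dagger - a^\dagger a$ is the identity, except in the last slot where the truncation bites.

```python
# Truncated matrices for a (like d/dz) and a_dag (multiplication by z)
# acting on span{1, z, ..., z^N}.
import numpy as np

N = 6
a = np.zeros((N + 1, N + 1))
a_dag = np.zeros((N + 1, N + 1))
for n in range(1, N + 1):
    a[n - 1, n] = n        # a z^n = n z^(n-1)
for n in range(N):
    a_dag[n + 1, n] = 1    # a_dag z^n = z^(n+1)

commutator = a @ a_dag - a_dag @ a
print(commutator)          # identity matrix, except the bottom-right corner
```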
My question regarding this chapter is: looking at the diagrams, am I supposed to take away how easily the diagrams map onto the equations used themselves?
Yes, it's pretty easy, though one might naively guess the wrong formula, and we talk about the wrong formula and how to correct it.
Does the game of working with a master equation for a given situation generally boil down to figuring out how to represent the Hamiltonian?
I would instead say that you don't even know what the master equation
$$\frac{d}{dt} \psi(t) = H \psi(t)$$
says until you know what the Hamiltonian $H$ equals. "Working with" the equation is something we do throughout the whole book, and you'll see there are many tricks for that.
It is so strange and interesting to me that one can make the analogy between quantum physics and stochastic problems like rabbit population growth so strongly. I suppose more philosophically, how can the same equations developed to model quanta of energy be used to model population growth?
They're not exactly the same equations: in quantum mechanics the Hamiltonians should be self-adjoint, while in stochastic mechanics they should be infinitesimal stochastic. We explained that in Sections 4.3 and 4.4. But in both cases we can often build these Hamiltonians using creation and annihilation operators.
If I had a good one-paragraph answer to why this is possible, I would have included that paragraph in the book. But since it seems "amazing yet true", I wanted to write a whole book about it. I'm glad you find it mind-blowing: it was mind-blowing to me too!
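To make the self-adjoint vs. infinitesimal stochastic contrast above concrete, here is a small sketch (my own illustration, with made-up 2×2 matrices): an infinitesimal stochastic $H$ makes $\exp(tH)$ preserve total probability, while a self-adjoint $H$ makes $\exp(-itH)$ preserve the norm.

```python
import numpy as np
from scipy.linalg import expm

# Infinitesimal stochastic: off-diagonal entries >= 0, each column sums to zero.
H_stoch = np.array([[-1.0,  2.0],
                    [ 1.0, -2.0]])
psi = np.array([0.3, 0.7])                  # a probability distribution
print((expm(1.5 * H_stoch) @ psi).sum())    # still 1.0

# Self-adjoint: H equals its conjugate transpose, so exp(-itH) is unitary.
H_qm = np.array([[1.0, 2.0 - 1.0j],
                 [2.0 + 1.0j, -1.0]])
phi = np.array([0.6, 0.8 + 0.0j])           # a unit vector
print(np.linalg.norm(expm(-1.5j * H_qm) @ phi))   # still 1.0
```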
Jacob Zelko said:
Oh and one final question for now: what makes the Number operator, $N = a^\dagger a$, so special?
I explained that at the bottom of page 52: if we apply it to a state with $n$ rabbits, it multiplies that state by the number of rabbits:
$$N z^n = n\, z^n.$$
It's very famous in physics, for precisely this reason, but with 'quanta' replacing 'rabbits'.
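Reusing the truncated $d/dz$ and multiplication-by-$z$ matrices from the commutator sketch above, $N = a^\dagger a$ really does just multiply the "n rabbits" state by $n$:

```python
# Number operator N = a_dag @ a on the truncated basis {1, z, ..., z^N}.
import numpy as np

N_TRUNC = 6
a = np.zeros((N_TRUNC + 1, N_TRUNC + 1))
a_dag = np.zeros((N_TRUNC + 1, N_TRUNC + 1))
for n in range(1, N_TRUNC + 1):
    a[n - 1, n] = n
for n in range(N_TRUNC):
    a_dag[n + 1, n] = 1

number_op = a_dag @ a
three_rabbits = np.zeros(N_TRUNC + 1)
three_rabbits[3] = 1.0                 # the state z^3
print(number_op @ three_rabbits)       # 3.0 in slot 3: N z^3 = 3 z^3
```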
John Baez said:
If I were being very formal I would have said
where the commutator of two operators S and T is ST − TS, abbreviated as [S,T].
Yea, that makes sense. Maybe a better way to have phrased my question plainly now that I've had a day to think on it is, "what is the commutator saying?" Because to me, based on what you said on page 50, it seems like a nice compact way to express the non-commutativity of two bespoke operators. Kinda as a short-hand reference. Is that accurate or is it saying something much different?
John Baez said:
My question regarding this chapter is: looking at the diagrams, am I supposed to take away how easily the diagrams map onto the equations used themselves?
Yes, it's pretty easy, though one might naively guess the wrong formula, and we talk about the wrong formula and how to correct it.
Ah got it. Again, just as another aside, it is quite fascinating how close even a coarse "guess" of how one might want to express a diagram in the language of quantum field theory gets you to the actual correct answer (I'm thinking about the "capturing rabbits" diagram given how close a "rough guess" actually gets you).
Aside: I don't want to derail too much but it reminds me of some of the approaches we took in my engineering courses where we took an expression, say Bernoulli's principle, and applied certain simplifications to be able to solve a resulting fluid dynamics problem. Of course, those simplifications would reduce the accuracy of the solution but we could always try to account for a correction later. In that setting, we would guess what we can throw out from an equation and in this setting we guess what we can throw into an equation. Just intriguing to see the different perspectives in this situation.
John Baez said:
It's very famous in physics, for precisely this reason, but with 'quanta' replacing 'rabbits'.
Ah fair enough! I just thought there was much more to it than what you had written but I guess it is a nice touchstone that you encounter every once in a while while working in quantum field theory. Cool!
Anyways, thanks for all the responses @John Baez! I'll keep digging in!
(Also @David Egolf thanks for showing me the emoji. It is amazing and :joy: )
Jacob Zelko said:
John Baez said:
If I were being very formal I would have said
where the commutator of two operators S and T is ST − TS, abbreviated as [S,T].
Yea, that makes sense. Maybe a better way to have phrased my question plainly now that I've had a day to think on it is, "what is the commutator saying?"
Oh, wow - that's a completely different question. You said you didn't know what the commutator "actually is". I thought you were pointing out some horrible deficiency of my book, that I never clearly defined the commutator.
Obviously the commutator [S,T] = ST - TS says how much two operators fail to commute. But that doesn't explain why the commutator is so important in math and physics: it's more useful than that.
Here is one of those cases where I look at Wikipedia, hoping they'll give a decent explanation so I don't have to... and I don't find one.
Briefly:
1) In quantum mechanics we prove that in any state $\psi$, the standard deviation in the measurement of an observable (self-adjoint operator) $A$ times the standard deviation in the measurement of an observable $B$ obeys
$$\Delta A \, \Delta B \;\ge\; \tfrac{1}{2}\,\bigl|\langle \psi, [A,B]\,\psi \rangle\bigr|.$$
This is called the uncertainty principle and you can find a proof by following the link - but unfortunately they prove a more complicated inequality from which this one easily follows.
2) The position and momentum observables, called $q$ and $p$ in quantum mechanics, obey the canonical commutation relations
$$[q, p] = i \hbar$$
where $\hbar$ is Planck's constant (which equals $1$ for pure mathematicians).
This is a fundamental fact, probably discovered by Heisenberg, that makes quantum mechanics work the way it does.
3) Putting 1) and 2) together, and using the fact that any state $\psi$ has $\langle \psi, \psi \rangle = 1$, we get:
In any state of a particle, the standard deviation of position times the standard deviation of momentum is at least $\hbar/2$:
$$\Delta q \, \Delta p \;\ge\; \frac{\hbar}{2}.$$
This is called the Heisenberg uncertainty principle.
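Here is a matrix-level sanity check of 2) (my own sketch, in the standard quantum convention $a|n\rangle = \sqrt{n}\,|n-1\rangle$ rather than the book's stochastic one, with $\hbar = 1$, $q = (a + a^\dagger)/\sqrt{2}$ and $p = (a - a^\dagger)/(i\sqrt{2})$): the commutator $[q,p]$ comes out as $i$ times the identity, apart from the corner where the truncation bites.

```python
import numpy as np

# Quantum-convention ladder operators on a truncated Fock space.
N = 8
a = np.zeros((N + 1, N + 1))
for n in range(1, N + 1):
    a[n - 1, n] = np.sqrt(n)   # a|n> = sqrt(n)|n-1>
a_dag = a.T                    # a_dag|n> = sqrt(n+1)|n+1>

q = (a + a_dag) / np.sqrt(2)
p = (a - a_dag) / (1j * np.sqrt(2))
print(np.round(q @ p - p @ q, 10))   # i times the identity, except the bottom-right corner
```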
4) Here's another fact, perhaps more important than the above, though harder to explain - so I won't really try: see Section 4.4 of the book for details.
I will now be a mathematician and set $\hbar = 1$.
In quantum mechanics any observable $A$ gives rise to a 1-parameter group of unitary operators $\exp(-itA)$. If $H$ is the energy or Hamiltonian then $\exp(-itH)$ describes how states evolve in time. If we evolve a state $\psi$ in time for some amount of time $t$ we get the state $\exp(-itH)\psi$. If we then measure some other observable $A$, the expected value of the result is
$$\langle \exp(-itH)\psi, \; A \,\exp(-itH)\psi \rangle \;=\; \langle \psi, \; \exp(itH)\, A\, \exp(-itH)\, \psi \rangle.$$
So, we think of
$$A(t) = \exp(itH)\, A\, \exp(-itH)$$
as the observable $A$ measured after waiting a time $t$.
What does this have to do with commutators? Here's what:
$$\frac{d}{dt} A(t) = i\,[H, A(t)].$$
It's a fun little calculation to show this. But it's important: it says the rate at which an observable changes with time is given by its commutator with the Hamiltonian.
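Here is a quick numerical version of that "fun little calculation" (my own check, with a small made-up Hermitian $H$ and observable $A$): the time derivative of $A(t) = e^{itH} A e^{-itH}$ agrees with $i[H, A(t)]$.

```python
import numpy as np
from scipy.linalg import expm

# A fixed random Hermitian Hamiltonian H and observable A.
rng = np.random.default_rng(0)
M1 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
M2 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (M1 + M1.conj().T) / 2
A = (M2 + M2.conj().T) / 2

def A_t(t):
    return expm(1j * t * H) @ A @ expm(-1j * t * H)

t, dt = 0.7, 1e-6
numerical_derivative = (A_t(t + dt) - A_t(t - dt)) / (2 * dt)
commutator_side = 1j * (H @ A_t(t) - A_t(t) @ H)
print(np.max(np.abs(numerical_derivative - commutator_side)))   # essentially zero
```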
This stuff works not only for quantum mechanics but also for stochastic mechanics - with some small changes sketched in Section 4.4.
So yes, you can use commutators to study how something like the population of rabbits changes with time!
We get into this more in Chapter 10 of the book.