You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
image.png
I was reading Marc Harper's paper "inference as replicator dynamics". I don't understand why the second equality holds. What am I missing? The original paper is here: https://arxiv-benchmark.informatik.uni-freiburg.de/data/benchmark/pdf/0911/0911.1763.pdf
OK, never mind, I found the answer in the paper. Average fitness is defined as the population-weighted mean, fbar = sum_i x_i f_i. Interesting.
Yeah, if 99 people have fitness 0 and 1 has fitness 1 the average fitness is 1/100, not 50/100.
By the way, this is not "John Baez's paper with Marc Harper" - you can see the author is Marc Harper.
Oh my bad :sweat_smile:
What is the rationale behind this? It sounds reasonable to me, but I failed to explain it myself
So if you had 100 friends, and 99 of them earned $0/year, and 1 of them earned $1,000,000/year, what would you say their average income was?
I see, it has to be weighted by population, not by categories. So $10,000.
Right. And if 99 people have fitness 0 and 1 has fitness 1 the average fitness is 1/100. This is because there are 100 people and they only have one child who survives.
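Both averages in the income example can be checked in a few lines (numbers taken from the example above):

```python
# Average income weighted by population (one entry per person), not by category.
incomes = [0] * 99 + [1_000_000]        # 99 friends earn $0, one earns $1,000,000
population_avg = sum(incomes) / len(incomes)
category_avg = (0 + 1_000_000) / 2      # the misleading per-category average

print(population_avg)   # 10000.0
print(category_avg)     # 500000.0
```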
@John Baez I'm thinking of extending this replicator dynamics to a bigger space, e.g. the probability assignment over the whole space of possible events 2^Ω instead of only the atomic events ω ∈ Ω, while relaxing some basic assumptions of probability theory, e.g. the law of excluded middle (P(A) + P(A^c) = 1 for all A), as is usually assumed in probability theory, to account for underparametrization. In this way the Bayesian dynamics / geometry are a limiting case of this dynamics / geometry in a bigger space. Are there any references (maybe in geometry or dynamical systems?) that you would recommend for looking into this problem? I think this is a problem that could be part of your initiative of uncertainty assessment for climate.
The problem seems more "topological" than "geometrical"; it doesn't look like classical differential geometry?
I saw several links on piecewise linear manifolds, but I'm not sure if they're relevant. A question could be: what kind of piecewise linear manifold can recover a Riemannian metric as a special / limiting case? It looks like a method used a lot in quantum gravity.
Sorry, I have no advice for you: your thoughts are too fragmentary for me to grasp them.
The question is how to generalize the inference dynamics & geometry without the assumption that they only work on probabilities of singleton events, and without the assumptions of probability theory as they stand in Marc Harper's paper.
As an example: for a hypothesis space Ω we're interested in studying the dynamics on the full event space 2^Ω, but Marc Harper only studied dynamics on the atomic events ω ∈ Ω. If we assume the law of excluded middle, P(A) + P(A^c) = 1, then the higher-order events don't really need to be considered, since their probabilities can be deduced. The question is what happens geometrically if we take out this assumption.
Okay. But that's not really a math question. Generalizing information geometry to some weird version of probability theory where we drop the assumption P(A) + P(A^c) = 1 is an open-ended research project, not a "question".
I would like to know if there are existing tools in mathematics (e.g. topology, category theory, etc.) that can handle this open-ended question. Dempster-Shafer theory is a generalization of Bayesian inference, so since Bayesian inference dynamics has a good geometric interpretation, I think Dempster-Shafer theory should too.
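Since Dempster-Shafer theory keeps coming up without a definition: a mass function puts weight on subsets of the hypothesis space rather than only on singletons, so additivity can fail. A minimal sketch (the hypothesis names and mass values here are made up for illustration, not taken from any paper in the thread):

```python
# A Dempster-Shafer mass function assigns mass to subsets of the frame Omega,
# not just to singletons; mass on Omega itself models unresolved uncertainty.
# Hypothesis names and mass values are illustrative.
omega = frozenset({"H1", "H2"})
m = {
    frozenset({"H1"}): 0.5,
    frozenset({"H2"}): 0.2,
    omega: 0.3,
}

def belief(A):
    # Bel(A): total mass committed to subsets of A
    return sum(v for B, v in m.items() if B <= A)

def plausibility(A):
    # Pl(A): total mass not contradicting A (i.e. intersecting it)
    return sum(v for B, v in m.items() if B & A)

A = frozenset({"H1"})
print(belief(A), plausibility(A))        # 0.5 0.8
print(belief(A) + belief(omega - A))     # 0.7: Bel(A) + Bel(A^c) = 1 fails
```

When all the mass sits on singletons, Bel = Pl and the ordinary probability picture is recovered, which is the sense in which Bayesian inference is the limiting case.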
If you don't mind some advice: I think it would be very good for you to learn how to ask the usual sort of math question, where you either ask if some clearly well-formed statement is true, or ask how to prove some such statement. For example, "are all functions in L^2[0,1] also in L^1[0,1]?", where the answer is "yes".
Hmm, I think the question here is that I'm trying to find a math framework, if any exists; if not, invent one. Maybe the first step is to define it more precisely?
Yes. I have no idea what you're trying to do, so I can't help you do it. And to be very honest, I'm actually afraid you don't know what you're trying to do, either.
It might help if I knew Dempster-Shafer theory - maybe then I could guess what you're trying to do. But I don't know that stuff.
By what criterion is "knowing what one's trying to do" satisfied? I want to know if the stability of the inference dynamics of a well-known generalization of Bayesian inference can be studied in differential geometry, as it is studied for Bayesian inference by Marc Harper. But I think your advice means to write up a formalism that embodies both Bayesian inference and Dempster-Shafer inference, so it becomes a pure math problem.
I get the sense that knowing what to do means you already have an answer to a question. But then I guess you don't need to ask any questions.
Maybe someone who was an expert on Dempster-Shafer theory and information geometry could take a vague question like
I want to know if stability of the inference dynamics of a well-known generalization of Bayesian inference can be studied in differential geometry, like it is studied in Bayesian inference by Marc Harper.
and tell you something interesting about it. But I can't.
You often seem to ask very open-ended questions. They make me feel you're asking someone else to do your research for you.
Mathematicians can usually help more when you ask more precise questions - the sort of question you're able to ask when you have a specific plan, and you need to know if some particular statement is true.
John Baez said:
Mathematicians can usually help more when you ask more precise questions - the sort of question you're able to ask when you have a specific plan, and you need to know if some particular statement is true.
I believe this! But this mode of interaction with mathematicians does require figuring out helpful precise questions to ask. And I think learning how to ask such questions takes a lot of time, experience, and work. I wish I knew how to do this better myself.
For what it's worth, @Peiyuan Zhu, my theory is that learning how to ask precise questions (and develop a precise research plan) can be accomplished by (1) getting a solid basic foundation in the area you want to study (doing lots of exercises, and asking questions about those) and (2) reading and understanding in detail papers that people have written in related areas (and talking to people about these). I am still working on doing this myself, but my hope is that once (1) and (2) are accomplished it becomes easier to ask questions that are both interesting and sufficiently precise. Maybe such questions can be generated by modifying questions already asked and answered in previously published papers, at least to start with.
I'm sure many people here can offer a more insightful perspective on this process than me, though.
John Baez said:
Maybe someone who was an expert on Dempster-Shafer theory and information geometry could take a vague question like
I want to know if stability of the inference dynamics of a well-known generalization of Bayesian inference can be studied in differential geometry, like it is studied in Bayesian inference by Marc Harper.
and tell you something interesting about it. But I can't.
@Peiyuan Zhu a viable method is to distill the essential details of the thing you want to ask about into a summary that gives others the context needed to understand your question. For instance, rather than pasting several pages of a book or saying a name like "Dempster-Shafer theory" (which I also do not know the content of), give a paragraph summary of what you have understood or a specific example and point to the thing you don't understand. The longer the summary, the smaller the chance of engagement, but it will at least significantly lower the effort required by someone trying to engage with your question.
For this specific topic, you only have a small chance of getting a satisfying answer: either someone has tried it somewhere and someone reading this topic has seen that work and can point you to it, or no one here knows (which is likely: Dempster-Shafer theory isn't directly categorical, and is deep enough into probability theory that even the categorical probability people here may not have seen it) in which case you'll just have to try it for yourself and find out or ask somewhere else. This is a space for discussing category theory, we don't know everything!
Morgan Rogers (he/him) said:
John Baez said:
Maybe someone who was an expert on Dempster-Shafer theory and information geometry could take a vague question like
I want to know if stability of the inference dynamics of a well-known generalization of Bayesian inference can be studied in differential geometry, like it is studied in Bayesian inference by Marc Harper.
and tell you something interesting about it. But I can't.
Peiyuan Zhu a viable method is to distill the essential details of the thing you want to ask about into a summary that gives others the context needed to understand your question. For instance, rather than pasting several pages of a book or saying a name like "Dempster-Shafer theory" (which I also do not know the content of), give a paragraph summary of what you have understood or a specific example and point to the thing you don't understand. The longer the summary, the smaller the chance of engagement, but it will at least significantly lower the effort required by someone trying to engage with your question.
For this specific topic, you only have a small chance of getting a satisfying answer: either someone has tried it somewhere and someone reading this topic has seen that work and can point you to it, or no one here knows (which is likely: Dempster-Shafer theory isn't directly categorical, and is deep enough into probability theory that even the categorical probability people here may not have seen it) in which case you'll just have to try it for yourself and find out or ask somewhere else. This is a space for discussing category theory, we don't know everything!
I like the comment "Dempster-Shafer theory isn't directly categorical": I can see that the way it is used has deep categorical structures, but it isn't immediately obvious how it can be categorified -- it suggests that some modification of the theory, or a different perspective, is needed.
I typeset a research proposal explaining this in more detail. Would it be suitable to post it here? Or shall I move this to the #practice: our work or #practice: our papers channels?
@Peiyuan Zhu do you know an advisor / professor who knows you better and could give you advice depending on your specific situation? If you want to do research on this subject, you have to take into account which person you can do it with, etc. There are aspects which are not strictly mathematical, and we don't have all the information to help you with those.
I've been looking for some people to critique this recently. I just heard back from several of them, and I'm ready to meet with some of them next week.
I'm trying this evolutionary dynamical system on a simple coin tossing model to make sure I understand the concepts.
The coin tossing model has two hypotheses: x_1 is a fair coin, p(H|x_1) = 1/2, and x_2 is a coin that only has heads, p(H|x_2) = 1.
Suppose we have a uniform prior, x_1 = x_2 = 1/2.
Suppose we observe a head, E = H.
Marc Harper's paper https://arxiv.org/pdf/0911.1763.pdf suggests that we can analyze such an inference problem by solving a replicator dynamic: dx_i/dt = x_i(p(E|x_i) - sum_j x_j p(E|x_j)).
With this evolutionary dynamics, I investigated two questions according to the paper.
First question: Is the posterior a fixed point?
Answer: No
Second question: Does posterior minimize KL-divergence near the fixed point?
Answer: No
However, in my reading of the paper, at least one of the above two questions should have the answer "yes". What am I missing in my understanding of the paper?
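One check worth separating out: Harper's exact correspondence is with the discrete-time replicator map x_i' = x_i f_i / fbar, which with fitness f_i = P(E|H_i) is literally Bayes' rule. A quick numerical sanity check (the prior and likelihood values are made up for illustration):

```python
# Discrete-time replicator step: x_i' = x_i * f_i / fbar, with fitness f_i = P(E|H_i).
# Under this identification the update is exactly Bayes' rule.
# Prior and likelihood values are illustrative.
prior = [0.3, 0.7]            # P(H1), P(H2)
likelihood = [0.9, 0.4]       # P(E|H1), P(E|H2)

fbar = sum(x * f for x, f in zip(prior, likelihood))        # mean fitness = P(E)
replicator_step = [x * f / fbar for x, f in zip(prior, likelihood)]

evidence = sum(p * l for p, l in zip(prior, likelihood))    # P(E), as in Bayes' rule
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]

print(replicator_step == posterior)   # True: the two updates coincide term by term
```

The questions about fixed points concern the continuous-time flow, which is a different object from this one-step map, so the two "yes" candidates need not behave the same way.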
Did you post this on Stack Exchange? I suspect there are more people who would be able to answer your question there (although I appreciate that you have taken some of the earlier advice on asking questions on board!)
Do you mean math stack-exchange https://math.stackexchange.com?
I just made posts here https://mathoverflow.net/questions/441129/bayesian-inference-as-replicator-dynamics and here https://math.stackexchange.com/questions/4641999/bayesian-inference-as-replicator-dynamics but nobody has replied so far.
Morgan Rogers (he/him) said:
Did you post this on Stack Exchange? I suspect there are more people who would be able to answer your question there (although I appreciate that you have taken some of the earlier advice on asking questions on board!)
I agree with Morgan, your question is much clearer than the previous ones!
Third question: Is there a more general formula than the one calculated above?
Answer: Yes, this is a logistic equation: writing x_2 = 1 - x_1, the two-type system reduces to dx1/dt = (p(E|x_1) - p(E|x_2)) * x1 * (1 - x1).
This topic was moved here from #learning: questions > evolutionary game by Matteo Capucci (he/him).
Some potential problems
[1] The result only holds for higher-dimensional dynamics.
[2] The result only holds for the discrete replicator dynamics.
Peiyuan Zhu said:
Some potential problems
[1] The result only holds for higher-dimensional dynamics.
[2] The result only holds for the discrete replicator dynamics.
Response to potential problems:
[1] The high dimensional replicator dynamics would experience the same problem of degenerate solution.
[2] The KL-divergence result is indeed for continuous time replicator dynamics.
For this system, there isn't a rest point in the interior of the simplex.
And maybe it's because I didn't understand this proof.
image.png
I think the definition of a Lyapunov function here is a function that is decreasing near an equilibrium point, with the replicator equation substituted in, etc. But it doesn't say anything about the situation where the equilibrium doesn't exist, does it?
If there's no rest point there's no ESS (evolutionarily stable state) so the theorem implies that for no point is the Kullback-Leibler divergence a local Lyapunov function.
So when the paper says "The replicator equation can now be understood as modeling the informational dynamics of the population distribution, moving in the direction of maximal local increase of potential with respect to the Fisher information, and ultimately converging to a minimal potential information state if a stabilizing state (ESS) exists in the interior of the state space", it means this only holds if an ESS exists. But if an ESS normally doesn't exist for Bayesian inference, this paper isn't fair to its title in saying "replicator equation as an inference dynamics". Am I correct?
That question is too vague and subjective to answer. Focus on the theorems, not whether it's "fair" to title the paper a certain way.
By "fair" I mean the solution of the replicator is in one-to-one correspondence with the Bayesian posterior. So can I understand the above sentence from the paper as "the rest point of the replicator is the Bayesian posterior obtained by minimizing the Lyapunov function if and only if the rest point is an ESS"? If I want to verify this statement with a numerical example, I would need to find an inference problem that has an ESS first. The paper doesn't say anything about when an inference problem has an ESS, so I can only try fairly arbitrary fitness functions myself, am I correct?
The theorem says exactly what it says: a state x̂ is an interior ESS for the replicator equation if and only if the Kullback-Leibler divergence D(x̂ || x) is a local Lyapunov function.
If you want a fun example of this theorem, pick a replicator equation that has an interior ESS.
I see, now it makes sense. I tried two examples already, but an ESS doesn't exist in either case. In 2d an interior ESS certainly doesn't exist. There are quite a lot of choices in 3d, but the previous example I tried above doesn't seem to have an interior ESS either. There are so many cases where an ESS doesn't exist; I'll keep trying. At least from the above example I know that the replicator divides the simplex into 2 × 2 × 2 = 8 possible regions with varying signs of the derivatives. The case where an interior ESS exists should be exactly the case where the three lines cross at one point on the simplex. This is an extremely small fraction of all legitimate inference problems.
It'd be interesting to see what inference conditions correspond to the existence of an ESS.
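For what it's worth, in this Bayesian setting the fitnesses are the constants f_i = p(E|H_i), so the condition for an interior rest point can be worked out directly (a small sketch, my own derivation rather than something quoted from the paper):

```latex
% An interior rest point requires every type's fitness to equal the mean fitness:
\dot{x}_i = x_i\,(f_i - \bar{f}) = 0, \quad x_i > 0
\;\Longrightarrow\; f_i = \bar{f} \ \text{for all } i.
% With constant fitnesses f_i = p(E \mid H_i), this forces
p(E \mid H_1) = p(E \mid H_2) = \cdots = p(E \mid H_n),
% i.e. an interior rest point exists only when every hypothesis assigns the
% same likelihood to the observed evidence -- a degenerate, measure-zero set
% of inference problems.
```

That would match the difficulty of finding an interior ESS by trial and error.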
OK, it looks like I still couldn't find an interior ESS.
Coin tossing model.
Observe H.
Two possible coins, p(H|x_i) = a, b:
dx1/dt = x1(a - a*x1 - b*x2)
dx2/dt = x2(b - a*x1 - b*x2)
There's no interior ESS.
Three possible coins, p(H|x_i) = a, b, c:
dx1/dt = x1(a - a*x1 - b*x2 - c*x3)
dx2/dt = x2(b - a*x1 - b*x2 - c*x3)
dx3/dt = x3(c - a*x1 - b*x2 - c*x3)
There's no interior ESS.
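The two-coin system above can be integrated numerically to see where trajectories actually go (a quick sketch; the likelihood values a = 0.9 and b = 0.4 are made up for illustration):

```python
# Euler integration of the continuous replicator dynamic for the two-coin model:
#   dx1/dt = x1*(a - fbar),  dx2/dt = x2*(b - fbar),  fbar = a*x1 + b*x2
# The likelihood values a, b are illustrative.
a, b = 0.9, 0.4
x1, x2 = 0.5, 0.5          # uniform prior
dt = 0.01
for _ in range(10_000):    # integrate up to t = 100
    fbar = a * x1 + b * x2
    x1, x2 = x1 + dt * x1 * (a - fbar), x2 + dt * x2 * (b - fbar)

# The trajectory runs to the vertex x = (1, 0): the fitter coin takes over,
# consistent with there being no interior rest point (hence no interior ESS).
print(round(x1, 3), round(x2, 3))   # 1.0 0.0
```

Since the fitnesses are constants, the interior condition a = fbar = b can only hold when a = b, so for any a ≠ b the flow must end at a vertex.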
2E58DC6B-90D3-4D25-A556-A7EAE7C6B729.png
So I’m still having trouble seeing this analogy.
New evidence doesn't depend on the prior probability at all, but in the replicator picture the fitness landscape does seem to depend on the population state. The analogy doesn't hold.
"Bayesian inference is a special case, formally, of the discrete replicator dynamic, since the fitness landscape in each coordinate may depend on the entire population distribution rather than only on the proportion of the i-type." The fitness landscape in Bayesian inference doesn't seem to depend even on the proportion of the i-type.
Unless the priors themselves are the parameters, which isn't the standard Bayesian setting that he laid out.
P(E|Hi) doesn’t depend on P(Hi)