Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: applied category theory

Topic: Consistency in probabilistic logic/inference


view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 17:18):

Hi all, I am opening a new thread regarding the conversion of data and measurements into logical statements. One of the main problems I see is that data will not always provide consistent probabilities, which may give inconsistent conclusions for different measurements. This often forces statisticians to ask whether one has enough data.

The idea of a p-value captures this aspect: we ask whether we can believe the conclusions we inferred from the data, given that we could have continued to collect even more data and risked changing our conclusions.

I think that if one wants to be able to have a sound (probabilistic/statistical/learning) logic for reasoning about the things one observes in the world, one needs to address data consistency within the logic itself (because logic is about assessing consistency too).

view this post on Zulip Oliver Shetler (Jun 08 2020 at 17:30):

Can you clarify what you mean by

One of the main problems I see is that data will not always provide consistent probabilities, which may give inconsistent conclusions for different measurements.

view this post on Zulip Oliver Shetler (Jun 08 2020 at 17:30):

Maybe some examples?

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 17:42):

Oliver Shetler said:

Maybe some examples?

Suppose that the data follow a curve $t \mapsto g(t)$ from time = 0 to time = 100 and a curve $t \mapsto f(t)$ from time = 100 to time = 1000.

If you only collect data from 0 to 100, then your statement will be $C_1 =$ "data looks like $g$ on average"; if you decide to collect more data until time = 1000, then your statement will be $C_2 =$ "data looks like $f$ on average" (because $g$ is only defined on a small interval). As a result, you can obtain two conclusions $C_1$ and $C_2$ that may be inconsistent with each other.
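The flip between $C_1$ and $C_2$ can be simulated with a toy estimator. A minimal sketch, assuming hypothetical choices $g(t) = t$ and $f(t) = 100$ (growth then plateau) and using a least-squares slope as the "conclusion":

```python
# Sketch: the same estimator yields inconsistent conclusions C_1 and C_2
# at different sample sizes. g, f and the slope estimator are my own
# illustrative choices, not from the discussion.

def slope(ts, ys):
    """Ordinary least-squares slope of ys against ts."""
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    num = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
    den = sum((t - mt) ** 2 for t in ts)
    return num / den

def signal(t):
    # g(t) = t on [0, 100], f(t) = 100 on (100, 1000]
    return t if t <= 100 else 100.0

short = [float(t) for t in range(0, 101)]
full = [float(t) for t in range(0, 1001)]

c1 = slope(short, [signal(t) for t in short])  # ~1.0 -> "data looks like g"
c2 = slope(full, [signal(t) for t in full])    # near 0 -> "data looks like f"
print(c1, c2)
```

Collecting more data does not refine $C_1$; it replaces it with an incompatible $C_2$.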

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 17:54):

A more concrete example is when our perception of the data varies -- different beliefs can lead to inconsistencies: see Sleeping Beauty Problem.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:38):

This problem interests me a lot. It seems a pretty good use case for sheaves

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:39):

Sheaves allow you to reason about local information. You are asking about translating these measurements into logical statements. As you point out in your example, these measurements can only carry a logical value that has something to do with the underlying structure of the space you are measuring on (time in your example)

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:39):

There are two things I'd look into: the work on dynamical systems done by Spivak & co. using temporal logic, and the work by Michael Robinson that uses sheaves to do data analysis.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:40):

So, my point is that the internal logic of a sheaf generalizes truth values from "true or false" to "true in this open set of your space". I don't know if this is what you want, but for sure it looks to me as the most appropriate setting to represent measurements as logical statements.
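As a toy illustration of open-set truth values (my own example, the Sierpinski space; not from the thread), one can compute in the Heyting algebra of opens, where negation is "the largest open where the statement fails":

```python
# Sketch: truth values as open sets rather than True/False.
# The Sierpinski topology on X = {0, 1} has opens {}, {0}, {0, 1}.

X = frozenset({0, 1})
opens = [frozenset(), frozenset({0}), X]

def join(us):
    """Union of a family of opens (logical 'or')."""
    out = set()
    for u in us:
        out |= u
    return frozenset(out)

def neg(u):
    """Heyting negation: the largest open disjoint from u."""
    return join(v for v in opens if not (v & u))

U = frozenset({0})
print(neg(U))             # frozenset(): "not U" holds nowhere
print(join([U, neg(U)]))  # frozenset({0}), not all of X
```

Note that $U \lor \neg U$ is not the whole space: excluded middle only holds on an open subset, which is exactly the "true in this open set" phenomenon.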

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:41):

I'm using sheaves heavily lately, so I'm happy to collaborate on this if you want to. :slight_smile:

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:48):

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :slight_smile:
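A minimal numerical sketch of this idea, under assumed hypothetical sensor offsets (the loop sum plays the role of a Cech 1-cocycle obstruction; this is an illustration of the mechanism, not Robinson's actual pipeline):

```python
# Sketch: three sensors A, B, C measure the same quantity on overlapping
# regions, and we record pairwise calibration offsets on the overlaps.
# The readings glue to one global calibration iff every loop sums to 0.

offsets = {("A", "B"): 1.0, ("B", "C"): 2.0, ("C", "A"): -2.5}

def obstruction(cycle):
    """Sum the pairwise offsets around a closed loop of overlaps."""
    total = 0.0
    for i in range(len(cycle)):
        total += offsets[(cycle[i], cycle[(i + 1) % len(cycle)])]
    return total

h = obstruction(["A", "B", "C"])
print(h)  # 0.5 != 0: no single global calibration is consistent with all three
```

The nonzero loop sum is the qualitative information: it localizes *where* and *by how much* the measurements fail to be consistent.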

view this post on Zulip Oliver Shetler (Jun 08 2020 at 18:58):

Rémy Tuyéras said:

Suppose that the data follow a curve $t \mapsto g(t)$ from time = 0 to time = 100 and a curve $t \mapsto f(t)$ from time = 100 to time = 1000.

If you only collect data from 0 to 100, then your statement will be $C_1 =$ "data looks like $g$ on average"; if you decide to collect more data until time = 1000, then your statement will be $C_2 =$ "data looks like $f$ on average" (because $g$ is only defined on a small interval). As a result, you can obtain two conclusions $C_1$ and $C_2$ that may be inconsistent with each other.

Yes, clearly your estimator can change with more data, especially in a time series where the parameter may vary over time. Just as clearly, there is often not enough information in the data to anticipate the change, so your problem can't be to "solve" the fact that data evolves over time. I am still unclear on what the problem is, then. Can you try to state your idea in the form of a problem in need of a solution?

view this post on Zulip Oliver Shetler (Jun 08 2020 at 19:01):

Fabrizio Genovese said:

Sheaves allow you to reason about local information. You are asking about translating these measurements into logical statements. As you point out in your example, these measurements can only carry a logical value that has something to do with the underlying structure of the space you are measuring on (time in your example)

So the goal is to look for ways to make "inconsistent" measurements into logically consistent statements?

view this post on Zulip Oliver Shetler (Jun 08 2020 at 19:01):

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :)

This sounds very interesting. Do you have a resource that describes this in more detail?

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:10):

Fabrizio Genovese said:

So, my point is that the internal logic of a sheaf generalizes truth values from "true or false" to "true in this open set of your space". I don't know if this is what you want, but for sure it looks to me as the most appropriate setting to represent measurements as logical statements.

Yes, as far as I understand, sheaves & cohomology can be used to detect missing data that can be used to complete already existing data in a given sheaf. Any obstruction detected at the sheaf level can then tell us something about why the data fails to glue.

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:11):

Oliver Shetler said:

Rémy Tuyéras said:

Suppose that the data follow a curve $t \mapsto g(t)$ from time = 0 to time = 100 and a curve $t \mapsto f(t)$ from time = 100 to time = 1000.

If you only collect data from 0 to 100, then your statement will be $C_1 =$ "data looks like $g$ on average"; if you decide to collect more data until time = 1000, then your statement will be $C_2 =$ "data looks like $f$ on average" (because $g$ is only defined on a small interval). As a result, you can obtain two conclusions $C_1$ and $C_2$ that may be inconsistent with each other.

Yes, clearly your estimator can change with more data, especially in a time series where the parameter may vary over time. Just as clearly, there is often not enough information in the data to anticipate the change, so your problem can't be to "solve" the fact that data evolves over time. I am still unclear on what the problem is, then. Can you try to state your idea in the form of a problem in need of a solution?

My point is: how do we incorporate the concept of "being enough (data)" into the logic? Such a concept would push us to be more cautious about the truth values of our logical deductions and, hopefully, allow us to avoid inconsistencies.

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:11):

I feel like statisticians have tried to address this question through test statistics and p-values. My question is: what do we have on our side?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:11):

Did you try by introducing modalities in your logic?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:12):

The modalities reflect directly on the topologies you allow on the base space, which is also interesting in this setting

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:20):

Oliver Shetler said:

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :slight_smile:

This sounds very interesting. Do you have a resource that describes this in more detail?

It's not interesting. It's plain dope. Sheaves are heavily used in ACT and yet very underrated considering how much they have to offer. An example of what I say can be found here: https://arxiv.org/abs/2005.12798

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:20):

Fabrizio Genovese said:

The modalities reflect directly on the topologies you allow on the base space, which is also interesting in this setting

Yes, I agree that modalities go in this direction. It would be nice to have a formalism for comparing two modalities, so one can decide which is more appropriate to the context, a bit like how it is done in hypothesis testing, where one compares a null hypothesis $H_0$ to an alternative $H_1$.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:23):

Well, the "comparison" could exactly be "comparing the underlying topologies given by the modalities" :slight_smile:

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:28):

Fabrizio Genovese said:

Well, the "comparison" could exactly be "comparing the underlying topologies given by the modalities" :)

That idea sounds interesting. In your case, how do you compare two topologies? Could you elaborate with an example?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:30):

I don't have any real example in mind, but regarding topologies as just distributive lattices, there are quite a lot of tools we can throw at them.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:30):

If we are thinking about Grothendieck topologies then I guess we could even order them by inclusion. Dunno how much sense this makes tho

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:48):

For finite topologies (i.e. topologies on finite sets), we can always use the partitioning the topology induces on the underlying space.

I would personally take it to the category of partitions, in which I would compare the two partitions with categorical tools. That is what I am getting at in this paper, where I essentially take two partitions of a space $S$, say $p_1 : S \to K_1$ and $p_2 : S \to K_2$, and I try to compare them through a morphism $p_1 \to p_2$ or $p_2 \to p_1$.
Of course, you cannot always do that, so one way around it is to introduce a product term $p_1 \times \varepsilon \to p_2$, where the extra partition $\varepsilon$ is a "pseudo-universal" object measuring the discrepancy between $p_1$ and $p_2$.
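A small sketch of the basic comparison (my own encoding, not the paper's: a partition of $S$ as a surjection $S \to K$, with a morphism $p_1 \to p_2$ existing exactly when $p_1$ refines $p_2$):

```python
# Sketch: comparing two partitions via existence of a morphism p1 -> p2.
# A partition is encoded as a dict mapping each point of S to its block label.

def blocks(p):
    """Group the domain of p by block label."""
    out = {}
    for x, k in p.items():
        out.setdefault(k, set()).add(x)
    return list(out.values())

def morphism_exists(p1, p2):
    """True iff there is a map K1 -> K2 commuting with p1, p2,
    i.e. every p1-block sits inside a single p2-block."""
    return all(len({p2[x] for x in b}) == 1 for b in blocks(p1))

p1 = {0: "a", 1: "a", 2: "b", 3: "c"}  # blocks {0,1} {2} {3}
p2 = {0: "u", 1: "u", 2: "v", 3: "v"}  # blocks {0,1} {2,3}
print(morphism_exists(p1, p2))  # True: p1 refines p2
print(morphism_exists(p2, p1))  # False: here an extra term would be needed
```

When the second check fails, no direct morphism exists, which is where a discrepancy-measuring term like the $\varepsilon$ above would come in.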

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:48):

I am not sure if this can be extended to the infinite case though

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:51):

Why use the partitioning and not the topology itself?

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:57):

Just for practical reasons: I am thinking from a programming viewpoint too.
Note that, in the finite case, you can reduce your topology to a set of disjoint intersections. Then you can retrieve the unions by taking sequences of partitions $F := p_1 \to p_2 \to p_3 \to \dots \to p_n$. Each arrow in this sequence $F$ joins parts of the partitions together to form coarser subdivisions.
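The reduction to disjoint intersections can be computed directly (a sketch with a hypothetical finite topology of my own choosing; points are identified when they lie in exactly the same opens):

```python
# Sketch: the partition of "disjoint intersections" a finite topology
# induces, obtained by grouping points with the same membership signature.

X = {0, 1, 2, 3}
opens = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)]

def atom_partition(points, opens):
    part = {}
    for x in points:
        # signature: which opens contain x
        signature = frozenset(i for i, u in enumerate(opens) if x in u)
        part.setdefault(signature, set()).add(x)
    return sorted(map(sorted, part.values()))

print(atom_partition(X, opens))  # [[0], [1], [2, 3]]
```

Every open is then a union of these atoms, so chains of coarsenings of this partition recover the lattice structure of the topology.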

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:02):

Yeah, I am not sure this holds in general

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:02):

Take the real numbers with the usual topology for instance

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:06):

Yeah, this does not hold in general I believe. But do we really need this?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:08):

True that I'm more comfortable with algebra than with geometry, but I'd try to do comparisons by looking at the following things:

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:09):

The second one is interesting, since $T$ may actually have "bigger" open sets than $T'$, or the open sets of $T, T'$ may be totally unrelated

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:10):

I'm sure there's plenty of stuff in pointless topology one can check out, btw

view this post on Zulip Oliver Shetler (Jun 08 2020 at 22:47):

Fabrizio Genovese said:

Oliver Shetler said:

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :)

This sounds very interesting. Do you have a resource that describes this in more detail?

It's not interesting. It's plain dope. Sheaves are heavily used in ACT and yet very underrated considering how much they have to offer. An example of what I say can be found here: https://arxiv.org/abs/2005.12798

"OPINION DYNAMICS ON DISCOURSE SHEAVES"? Whaaaat? Dope really is the only way to describe this paper.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 14:01):

Oliver Shetler said:

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :)

This sounds very interesting. Do you have a resource that describes this in more detail?

If the goal is to use sheaf cohomology to ask whether different measurements give answers that glue together consistently, I'd look at https://arxiv.org/abs/1502.03097 and related works.

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:37):

I may be biased here, but I've never been a fan of the contextuality work. I feel like the underlying ideas are conceptually easy to understand, but, at least to me, those papers always obscured what the whole point was about

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:38):

Actually, I've watched many lectures and presentations about this line of work

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:38):

And those have been responsible for me constantly feeling too stupid to get sheaf theory.

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:39):

I don't know, I used to think that "if it doesn't click for me after so many times maybe I'm too stupid to get it". Then I watched one talk by Michael Robinson and got it immediately. This obviously left me with a strong bias against the contextuality work. Even if it's a matter of personal experience, I wouldn't suggest anyone dive into that stuff to understand how cohomology can help. :slight_smile:

view this post on Zulip (=_=) (Jun 09 2020 at 14:53):

Fabrizio Genovese said:

Then I watched one talk by Michael Robinson and I got it immediately.

This DARPA tutorial and the associated videos? Well, clearly he knew how to design the material for his target audience. :smiley:

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:54):

In my specific case, I watched a presentation at NIST in 2018, and it was illuminating.

view this post on Zulip (=_=) (Jun 09 2020 at 14:55):

Well, he's got a series of slides and videos on that tutorial site, which looks promising.

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:55):

Last time I checked some lectures had some audio problems tho

view this post on Zulip (=_=) (Jun 09 2020 at 14:56):

Fair enough. Those are usually relatively minor, though. I should learn this material anyway, so thanks for the tip.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 15:14):

Fabrizio Genovese said:

I may be biased here, but I've never been a fan of the contextuality work. I feel like the underlying ideas are conceptually easy to understand, but, at least to me, those papers always obscured what the whole point was about

I might be biased too, having contributed to some more recent developments in the framework. It's certainly not the source to learn sheaf theory from, and in a sense the story is precisely about probabilities failing to be a sheaf: contextuality, in this pov, is exactly about how you might have a consistent family of probability distributions (i.e. the marginals agree) that fails to admit any gluing at all to a global joint distribution.
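This failure to glue can be checked by brute force for the PR box. A minimal sketch (my own encoding of the PR-box supports, not from the thread): each context distribution is uniform on two outcomes with agreeing marginals, yet no global assignment is possible in every context, so no joint distribution can restrict to all of them.

```python
# Sketch: agreeing marginals with no global joint distribution (PR box).
from itertools import product

def pr_support(i, j):
    # joint outcomes (a_i, b_j) the PR box allows: a xor b = 1 iff i = j = 2
    target = 1 if (i == 2 and j == 2) else 0
    return {(a, b) for a in (0, 1) for b in (0, 1) if a ^ b == target}

contexts = {(i, j): pr_support(i, j) for i in (1, 2) for j in (1, 2)}

# Consistency: each context has two outcomes, and every variable's
# marginal support is {0, 1}, so the (uniform) marginals all agree.
for supp in contexts.values():
    assert len(supp) == 2
    assert {a for a, _ in supp} == {0, 1}
    assert {b for _, b in supp} == {0, 1}

# Gluing: is any global assignment (a1, a2, b1, b2) allowed in all contexts?
glueable = [
    (a1, a2, b1, b2)
    for a1, a2, b1, b2 in product((0, 1), repeat=4)
    if all(((a1, a2)[i - 1], (b1, b2)[j - 1]) in contexts[(i, j)]
           for i in (1, 2) for j in (1, 2))
]
print(len(glueable))  # 0: no global joint distribution exists
```

Any global joint distribution would have to be supported on globally consistent assignments, and the enumeration shows there are none.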

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 15:16):

Yes, now I get it. But imagine seeing the same damn diagram showing up in any Samson/Kohei talk (you know the diagram I'm talking about I guess, it's everywhere xD) and not getting it. Not. Once.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 15:19):

You mean something like the fibre-bundle picture of the PR box that admits no global sections?

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 15:20):

Yes, that one xD

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 15:21):

Every time it was like "No, I swear this time I have to understand it. It's almost criminal that I don't at this point" ...5 minutes later "WTFFFFFF"

view this post on Zulip Martti Karvonen (Jun 09 2020 at 17:30):

Here's one big-picture pov on those pictures (not written out in those papers afaik): you have a simplicial complex $\Sigma$ that gives you measurements and specifies which of them are jointly possible. If you collect data over those measurements, and only remember whether particular (joint) outcomes were possible or not, you can view this as another simplicial complex $\Gamma$, which comes with a fibration to $\Sigma$: faces lying over a given face of $\Sigma$ are the possible joint outcomes over that joint measurement. Now, such a thing is strongly contextual if $\Gamma \to \Sigma$ has no sections, i.e. no simplicial maps $\Sigma \to \Gamma$ such that the composite is $\mathrm{id}_\Sigma$. The picture in question is an example of this when $\Sigma = \{\{a_i, b_j\} \mid i, j = 1, 2\}$ and $\Gamma$ lists the specific possible outcomes of the PR box.
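The no-section statement can be verified by enumeration. A small sketch (my own encoding of the faces of $\Sigma$ and the PR-box outcomes lying over them):

```python
# Sketch: searching for a section of Gamma -> Sigma for the PR box.
from itertools import product

vertices = ("a1", "a2", "b1", "b2")

# Faces of Sigma, each with the joint outcomes Gamma allows over it
gamma = {
    ("a1", "b1"): {(0, 0), (1, 1)},
    ("a1", "b2"): {(0, 0), (1, 1)},
    ("a2", "b1"): {(0, 0), (1, 1)},
    ("a2", "b2"): {(0, 1), (1, 0)},
}

sections = []
for vals in product((0, 1), repeat=4):
    s = dict(zip(vertices, vals))
    # s is a section iff over every face of Sigma it picks a face of Gamma
    if all((s[x], s[y]) in outcomes for (x, y), outcomes in gamma.items()):
        sections.append(s)

print(len(sections))  # 0: Gamma -> Sigma has no section, i.e. strong contextuality
```

The parity argument behind the empty search: the first three faces force $a_1 = b_1 = b_2 = a_2$, which contradicts the fourth.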

view this post on Zulip Martti Karvonen (Jun 09 2020 at 17:31):

I mean, they do talk about the fibre-bundle picture, but afaik the published works don't define fibrations of simplicial complexes etc.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 17:46):

And concretely, that says that there is no assignment of values to each variable in a way that agrees with the predictions of the observed data (i.e. is deemed a possible outcome for every joint measurement).