Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: applied category theory

Topic: Consistency in probabilistic logic/inference


view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 17:18):

Hi all, I am opening a new thread regarding the conversion of data and measurements into logical statements. One of the main problems I see is that data will not always provide consistent probabilities, which may give inconsistent conclusions for different measurements. This often forces statisticians to ask whether one has enough data.

The idea of a p-value captures this aspect: we ask whether we can believe the conclusions we inferred from the data, given that we could have continued to collect even more data and risked changing our conclusions.

I think that if one wants to be able to have a sound (probabilistic/statistical/learning) logic for reasoning about the things one observes in the world, one needs to address data consistency within the logic itself (because logic is about assessing consistency too).

view this post on Zulip Oliver Shetler (Jun 08 2020 at 17:30):

Can you clarify what you mean by

One of the main problems I see is that data will not always provide consistent probabilities, which may give inconsistent conclusions for different measurements.

view this post on Zulip Oliver Shetler (Jun 08 2020 at 17:30):

Maybe some examples?

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 17:42):

Oliver Shetler said:

Maybe some examples?

Suppose that the data follow a curve $t \mapsto g(t)$ from time = 0 to time = 100 and a curve $t \mapsto f(t)$ from time = 100 to time = 1000.

If you only collect data from 0 to 100, then your statement will be $C_1 =$ "data looks like $g$ on average"; if you decide to collect more data until time = 1000, then your statement will be $C_2 =$ "data looks like $f$ on average" (because $g$ is only defined on a small interval). As a result, you can obtain two conclusions $C_1$ and $C_2$ that may be inconsistent with each other.
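The flip between $C_1$ and $C_2$ can be simulated with a toy estimator. A minimal sketch, assuming hypothetical choices $g(t) = t$ and $f(t) = 100$ (growth then plateau) and using a least-squares slope as the "conclusion":

```python
# Sketch: the same estimator yields inconsistent conclusions C_1 and C_2
# at different sample sizes. g, f and the slope estimator are my own
# illustrative choices, not from the discussion.

def slope(ts, ys):
    """Ordinary least-squares slope of ys against ts."""
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    num = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
    den = sum((t - mt) ** 2 for t in ts)
    return num / den

def signal(t):
    # g(t) = t on [0, 100], f(t) = 100 on (100, 1000]
    return t if t <= 100 else 100.0

short = [float(t) for t in range(0, 101)]
full = [float(t) for t in range(0, 1001)]

c1 = slope(short, [signal(t) for t in short])  # ~1.0 -> "data looks like g"
c2 = slope(full, [signal(t) for t in full])    # near 0 -> "data looks like f"
print(c1, c2)
```

Collecting more data does not refine $C_1$; it replaces it with an incompatible $C_2$.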

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 17:54):

A more concrete example is when our perception of the data varies -- different beliefs can lead to inconsistencies: see Sleeping Beauty Problem.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:38):

This problem interests me a lot. It seems a pretty good use case for sheaves

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:39):

Sheaves allow you to reason about local information. You are asking about translating these measurements into logical statements. As you point out in your example, these measurements can only carry a logical value that has something to do with the underlying structure of the space you are measuring on (time in your example)

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:39):

There are two things I'd look into: the work on dynamical systems done by Spivak & co. using temporal logic, and the work by Michael Robinson that uses sheaves to do data analysis.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:40):

So, my point is that the internal logic of a sheaf generalizes truth values from "true or false" to "true in this open set of your space". I don't know if this is what you want, but for sure it looks to me as the most appropriate setting to represent measurements as logical statements.
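As a toy illustration of open-set truth values (my own example, the Sierpinski space; not from the thread), one can compute in the Heyting algebra of opens, where negation is "the largest open where the statement fails":

```python
# Sketch: truth values as open sets rather than True/False.
# The Sierpinski topology on X = {0, 1} has opens {}, {0}, {0, 1}.

X = frozenset({0, 1})
opens = [frozenset(), frozenset({0}), X]

def join(us):
    """Union of a family of opens (logical 'or')."""
    out = set()
    for u in us:
        out |= u
    return frozenset(out)

def neg(u):
    """Heyting negation: the largest open disjoint from u."""
    return join(v for v in opens if not (v & u))

U = frozenset({0})
print(neg(U))             # frozenset(): "not U" holds nowhere
print(join([U, neg(U)]))  # frozenset({0}), not all of X
```

Note that $U \lor \neg U$ is not the whole space: excluded middle only holds on an open subset, which is exactly the "true in this open set" phenomenon.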

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:41):

I'm using sheaves heavily lately, so I'm happy to collaborate on this if you want to. :slight_smile:

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 18:48):

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :slight_smile:
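A minimal numerical sketch of this idea, under assumed hypothetical sensor offsets (the loop sum plays the role of a Cech 1-cocycle obstruction; this is an illustration of the mechanism, not Robinson's actual pipeline):

```python
# Sketch: three sensors A, B, C measure the same quantity on overlapping
# regions, and we record pairwise calibration offsets on the overlaps.
# The readings glue to one global calibration iff every loop sums to 0.

offsets = {("A", "B"): 1.0, ("B", "C"): 2.0, ("C", "A"): -2.5}

def obstruction(cycle):
    """Sum the pairwise offsets around a closed loop of overlaps."""
    total = 0.0
    for i in range(len(cycle)):
        total += offsets[(cycle[i], cycle[(i + 1) % len(cycle)])]
    return total

h = obstruction(["A", "B", "C"])
print(h)  # 0.5 != 0: no single global calibration is consistent with all three
```

The nonzero loop sum is the qualitative information: it localizes *where* and *by how much* the measurements fail to be consistent.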

view this post on Zulip Oliver Shetler (Jun 08 2020 at 18:58):

Rémy Tuyéras said:

Suppose that the data follow a curve $t \mapsto g(t)$ from time = 0 to time = 100 and a curve $t \mapsto f(t)$ from time = 100 to time = 1000.

If you only collect data from 0 to 100, then your statement will be $C_1 =$ "data looks like $g$ on average"; if you decide to collect more data until time = 1000, then your statement will be $C_2 =$ "data looks like $f$ on average" (because $g$ is only defined on a small interval). As a result, you can obtain two conclusions $C_1$ and $C_2$ that may be inconsistent with each other.

Yes, clearly your estimator can change with more data, especially in a time series where the parameter may vary over time. Just as clearly, there is often not enough information in the data to anticipate the change, so your problem can't be to "solve" the fact that data evolves over time. I am still unclear on what the problem is, then. Can you try to state your idea in the form of a problem in need of a solution?

view this post on Zulip Oliver Shetler (Jun 08 2020 at 19:01):

Fabrizio Genovese said:

Sheaves allow you to reason about local information. You are asking about translating these measurements into logical statements. As you point out in your example, these measurements can only carry a logical value that has something to do with the underlying structure of the space you are measuring on (time in your example)

So the goal is to look for ways to make "inconsistent" measurements into logically consistent statements?

view this post on Zulip Oliver Shetler (Jun 08 2020 at 19:01):

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :)

This sounds very interesting. Do you have a resource that describes this in more detail?

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:10):

Fabrizio Genovese said:

So, my point is that the internal logic of a sheaf generalizes truth values from "true or false" to "true in this open set of your space". I don't know if this is what you want, but for sure it looks to me as the most appropriate setting to represent measurements as logical statements.

Yes, as far as I understand, sheaves & cohomology can be used to detect missing data that can be used to complete already existing data in a given sheaf. Any obstruction detected at the sheaf level can then tell us something about why the data fails to glue.

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:11):

Oliver Shetler said:

Rémy Tuyéras said:

Suppose that the data follow a curve $t \mapsto g(t)$ from time = 0 to time = 100 and a curve $t \mapsto f(t)$ from time = 100 to time = 1000.

If you only collect data from 0 to 100, then your statement will be $C_1 =$ "data looks like $g$ on average"; if you decide to collect more data until time = 1000, then your statement will be $C_2 =$ "data looks like $f$ on average" (because $g$ is only defined on a small interval). As a result, you can obtain two conclusions $C_1$ and $C_2$ that may be inconsistent with each other.

Yes, clearly your estimator can change with more data, especially in a time series where the parameter may vary over time. Just as clearly, there is often not enough information in the data to anticipate the change, so your problem can't be to "solve" the fact that data evolves over time. I am still unclear on what the problem is, then. Can you try to state your idea in the form of a problem in need of a solution?

My point is: how do we incorporate the concept of "being enough (data)" into the logic? Such a concept would push us to be more cautious about the truth values of our logical deductions and, hopefully, allow us to avoid inconsistencies.

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:11):

I feel like statisticians have tried to address this question through test statistics and p-values. My question is: what do we have on our side?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:11):

Did you try by introducing modalities in your logic?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:12):

The modalities reflect directly on the topologies you allow on the base space, which is also interesting in this setting

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:20):

Oliver Shetler said:

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :slight_smile:

This sounds very interesting. Do you have a resource that describes this in more detail?

It's not interesting. It's plain dope. Sheaves are heavily used in ACT and yet very underrated considering how much they have to offer. An example of what I say can be found here: https://arxiv.org/abs/2005.12798

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:20):

Fabrizio Genovese said:

The modalities reflect directly on the topologies you allow on the base space, which is also interesting in this setting

Yes, I agree that modalities go in this direction. It would be nice to have a formalism for comparing two modalities, so one can decide which is more appropriate to the context, a bit like how it is done in hypothesis testing, where one compares a null hypothesis $H_0$ to an alternative $H_1$.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:23):

Well, the "comparison" could exactly be "comparing the underlying topologies given by the modalities" :slight_smile:

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:28):

Fabrizio Genovese said:

Well, the "comparison" could exactly be "comparing the underlying topologies given by the modalities" :)

That idea sounds interesting. In your case, how do you compare two topologies? Could you elaborate with an example?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:30):

I don't have any real example in mind, but regarding topologies as just distributive lattices, there are quite a lot of tools we can throw at them.

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:30):

If we are thinking about Grothendieck topologies then I guess we could even order them by inclusion. Dunno how much sense this makes tho

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:48):

For finite topologies (i.e. topologies on finite sets), we can always use the partitioning the topology induces on the underlying space.

I would personally take it to the category of partitions, in which I would compare the two partitions with categorical tools. That is what I am getting at in this paper, where I essentially take two partitions of a space $S$, say $p_1 : S \to K_1$ and $p_2 : S \to K_2$, and I try to compare them through a morphism $p_1 \to p_2$ or $p_2 \to p_1$.
Of course, you cannot always do that, so one way around it is to introduce a product term $p_1 \times \varepsilon \to p_2$, where the extra partition $\varepsilon$ is a "pseudo-universal" object measuring the discrepancy between $p_1$ and $p_2$.
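A small sketch of the basic comparison (my own encoding, not the paper's: a partition of $S$ as a surjection $S \to K$, with a morphism $p_1 \to p_2$ existing exactly when $p_1$ refines $p_2$):

```python
# Sketch: comparing two partitions via existence of a morphism p1 -> p2.
# A partition is encoded as a dict mapping each point of S to its block label.

def blocks(p):
    """Group the domain of p by block label."""
    out = {}
    for x, k in p.items():
        out.setdefault(k, set()).add(x)
    return list(out.values())

def morphism_exists(p1, p2):
    """True iff there is a map K1 -> K2 commuting with p1, p2,
    i.e. every p1-block sits inside a single p2-block."""
    return all(len({p2[x] for x in b}) == 1 for b in blocks(p1))

p1 = {0: "a", 1: "a", 2: "b", 3: "c"}  # blocks {0,1} {2} {3}
p2 = {0: "u", 1: "u", 2: "v", 3: "v"}  # blocks {0,1} {2,3}
print(morphism_exists(p1, p2))  # True: p1 refines p2
print(morphism_exists(p2, p1))  # False: here an extra term would be needed
```

When the second check fails, no direct morphism exists, which is where a discrepancy-measuring term like the $\varepsilon$ above would come in.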

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:48):

I am not sure if this can be extended to the infinite case though

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 19:51):

Why use the partitioning and not the topology itself?

view this post on Zulip Rémy Tuyéras (Jun 08 2020 at 19:57):

Just for practical reasons: I am thinking from a programming viewpoint too.
Note that, in the finite case, you can reduce your topology to a set of disjoint intersections. Then you can retrieve the unions by taking sequences of partitions $F := p_1 \to p_2 \to p_3 \to \dots \to p_n$. Each arrow in this sequence $F$ joins parts of the partitions together to form coarser subdivisions.
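The reduction to disjoint intersections can be computed directly (a sketch with a hypothetical finite topology of my own choosing; points are identified when they lie in exactly the same opens):

```python
# Sketch: the partition of "disjoint intersections" a finite topology
# induces, obtained by grouping points with the same membership signature.

X = {0, 1, 2, 3}
opens = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)]

def atom_partition(points, opens):
    part = {}
    for x in points:
        # signature: which opens contain x
        signature = frozenset(i for i, u in enumerate(opens) if x in u)
        part.setdefault(signature, set()).add(x)
    return sorted(map(sorted, part.values()))

print(atom_partition(X, opens))  # [[0], [1], [2, 3]]
```

Every open is then a union of these atoms, so chains of coarsenings of this partition recover the lattice structure of the topology.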

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:02):

Yeah, I am not sure this holds in general

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:02):

Take the real numbers with the usual topology for instance

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:06):

Yeah, this does not hold in general I believe. But do we really need this?

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:08):

True that I'm more comfortable with algebra than with geometry, but I'd try to do comparisons by looking at the following things:

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:09):

The second one is interesting, since $T$ may actually have "bigger" open sets than $T'$, or the open sets of $T, T'$ may be totally unrelated

view this post on Zulip Fabrizio Genovese (Jun 08 2020 at 20:10):

I'm sure there's plenty of stuff in pointless topology one can check out, btw

view this post on Zulip Oliver Shetler (Jun 08 2020 at 22:47):

Fabrizio Genovese said:

Oliver Shetler said:

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :)

This sounds very interesting. Do you have a resource that describes this in more detail?

It's not interesting. It's plain dope. Sheaves are heavily used in ACT and yet very underrated considering how much they have to offer. An example of what I say can be found here: https://arxiv.org/abs/2005.12798

"OPINION DYNAMICS ON DISCOURSE SHEAVES"? Whaaaat? Dope really is the only way to describe this paper.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 14:01):

Oliver Shetler said:

Fabrizio Genovese said:

Another important thing: When you have incompatible measurements, you can use sheaf cohomology to get qualitative information about why your measurements are inconsistent with each other. I guess this could be useful as well. :)

This sounds very interesting. Do you have a resource that describes this in more detail?

If the goal is to use sheaf cohomology to ask whether different measurements give answers that glue together consistently, I'd look at https://arxiv.org/abs/1502.03097 and related works.

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:37):

I may be biased here, but I've never been a fan of the contextuality work. I feel like the underlying ideas are conceptually easy to understand, but, at least to me, those papers always obscured what the whole point was about

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:38):

Actually, I've watched many lectures and presentations about this line of work

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:38):

And those have been responsible for me constantly feeling too stupid to get sheaf theory.

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:39):

I don't know, I used to think that "if it doesn't click for me after so many times maybe I'm too stupid to get it". Then I watched one talk by Michael Robinson and got it immediately. This obviously left me with a strong bias against the contextuality work. Even if it's a matter of personal experience, I wouldn't suggest anyone dive into that stuff to understand how cohomology can help. :slight_smile:

view this post on Zulip (=_=) (Jun 09 2020 at 14:53):

Fabrizio Genovese said:

Then I watched one talk by Michael Robinson and I got it immediately.

This DARPA tutorial and the associated videos? Well, clearly he knew how to design the material for his target audience. :smiley:

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:54):

In my specific case, I watched a presentation at NIST in 2018, and it was illuminating.

view this post on Zulip (=_=) (Jun 09 2020 at 14:55):

Well, he's got a series of slides and videos on that tutorial site, which looks promising.

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 14:55):

Last time I checked some lectures had some audio problems tho

view this post on Zulip (=_=) (Jun 09 2020 at 14:56):

Fair enough. Those are usually relatively minor, though. I should learn this material anyway, so thanks for the tip.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 15:14):

Fabrizio Genovese said:

I may be biased here, but I've never been a fan of the contextuality work. I feel like the underlying ideas are conceptually easy to understand, but, at least to me, those papers always obscured what the whole point was about

I might be biased too, having contributed to some more recent developments in the framework. It's certainly not the source to learn sheaf theory from, and in a sense the story is precisely about probabilities failing to be a sheaf: contextuality, in this pov, is exactly about how you might have a consistent family of probability distributions (i.e. the marginals agree) that fails to admit any gluing at all to a global joint distribution.
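This failure to glue can be checked by brute force for the PR box. A minimal sketch (my own encoding of the PR-box supports, not from the thread): each context distribution is uniform on two outcomes with agreeing marginals, yet no global assignment is possible in every context, so no joint distribution can restrict to all of them.

```python
# Sketch: agreeing marginals with no global joint distribution (PR box).
from itertools import product

def pr_support(i, j):
    # joint outcomes (a_i, b_j) the PR box allows: a xor b = 1 iff i = j = 2
    target = 1 if (i == 2 and j == 2) else 0
    return {(a, b) for a in (0, 1) for b in (0, 1) if a ^ b == target}

contexts = {(i, j): pr_support(i, j) for i in (1, 2) for j in (1, 2)}

# Consistency: each context has two outcomes, and every variable's
# marginal support is {0, 1}, so the (uniform) marginals all agree.
for supp in contexts.values():
    assert len(supp) == 2
    assert {a for a, _ in supp} == {0, 1}
    assert {b for _, b in supp} == {0, 1}

# Gluing: is any global assignment (a1, a2, b1, b2) allowed in all contexts?
glueable = [
    (a1, a2, b1, b2)
    for a1, a2, b1, b2 in product((0, 1), repeat=4)
    if all(((a1, a2)[i - 1], (b1, b2)[j - 1]) in contexts[(i, j)]
           for i in (1, 2) for j in (1, 2))
]
print(len(glueable))  # 0: no global joint distribution exists
```

Any global joint distribution would have to be supported on globally consistent assignments, and the enumeration shows there are none.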

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 15:16):

Yes, now I get it. But imagine seeing the same damn diagram showing up in any Samson/Kohei talk (you know the diagram I'm talking about I guess, it's everywhere xD) and not getting it. Not. Once.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 15:19):

You mean something like the fibre-bundle picture of the PR box that admits no global sections?

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 15:20):

Yes, that one xD

view this post on Zulip Fabrizio Genovese (Jun 09 2020 at 15:21):

Every time it was like "No, I swear this time I have to understand it. It's almost criminal that I don't at this point" ...5 minutes later "WTFFFFFF"

view this post on Zulip Martti Karvonen (Jun 09 2020 at 17:30):

Here's one big-picture pov on those pictures (not written out in those papers afaik): you have a simplicial complex $\Sigma$ that gives you measurements and specifies which of them are jointly possible. If you collect data over those measurements, and only remember whether particular (joint) outcomes were possible or not, you can view this as another simplicial complex $\Gamma$, which comes with a fibration to $\Sigma$: faces lying over a given face of $\Sigma$ are the possible joint outcomes over that joint measurement. Now, such a thing is strongly contextual if $\Gamma \to \Sigma$ has no sections, i.e. no simplicial maps $\Sigma \to \Gamma$ such that the composite is $\mathrm{id}_\Sigma$. The picture in question is an example of this when $\Sigma = \{\{a_i, b_j\} \mid i, j = 1, 2\}$ and $\Gamma$ lists the specific possible outcomes of the PR box.
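The no-section statement can be verified by enumeration. A small sketch (my own encoding of the faces of $\Sigma$ and the PR-box outcomes lying over them):

```python
# Sketch: searching for a section of Gamma -> Sigma for the PR box.
from itertools import product

vertices = ("a1", "a2", "b1", "b2")

# Faces of Sigma, each with the joint outcomes Gamma allows over it
gamma = {
    ("a1", "b1"): {(0, 0), (1, 1)},
    ("a1", "b2"): {(0, 0), (1, 1)},
    ("a2", "b1"): {(0, 0), (1, 1)},
    ("a2", "b2"): {(0, 1), (1, 0)},
}

sections = []
for vals in product((0, 1), repeat=4):
    s = dict(zip(vertices, vals))
    # s is a section iff over every face of Sigma it picks a face of Gamma
    if all((s[x], s[y]) in outcomes for (x, y), outcomes in gamma.items()):
        sections.append(s)

print(len(sections))  # 0: Gamma -> Sigma has no section, i.e. strong contextuality
```

The parity argument behind the empty search: the first three faces force $a_1 = b_1 = b_2 = a_2$, which contradicts the fourth.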

view this post on Zulip Martti Karvonen (Jun 09 2020 at 17:31):

I mean, they do talk about the fibre-bundle picture, but afaik the published works don't define fibrations of simplicial complexes etc.

view this post on Zulip Martti Karvonen (Jun 09 2020 at 17:46):

And concretely, that says that there is no assignment of values to each variable in a way that agrees with the predictions of the observed data (i.e. is deemed a possible outcome for every joint measurement).