Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: theory: categorical probability

Topic: "quasi-knowledge" of an agent as an R-algebra and convex opt


view this post on Zulip Duyal Yolcu (Aug 09 2022 at 15:21):

Hi, I have been working on an algebraic description of an agent's "states of knowledge" about an environment that works for classical, pure-quantum, and mixed-quantum situations. These states turn out to form a convex subset of an ordered (associative, commutative) $\mathbb{R}$-algebra, constructed via a Grothendieck group construction from the set of joint probability matrices modulo an appropriate equivalence relation. The main point is that this should allow using convex optimization and duality to find optimal strategies/lower bounds for an agent trying to achieve something (e.g. learn something about the environment), reproducing work of Barnum-Saks-Szegedy and the duality between the quantum adversary bound and universal query algorithms in a way that immediately applies to classical/mixed-quantum situations. Setting mathematical rigour aside a little, I can also describe "differential equations of knowledge", solved by formal power series (e.g. for a Poissonian process in which some source of information emits experimental data with a certain probability per unit of time). See sections 6/7 in the writeup - I haven't written everything I want to yet, but what is there should be consistent.

Is this algebra old news here, or do you have some other feedback? I am fairly hopeful that there is something correct in it (as it reproduces results from quantum complexity theory), and it's not old news to quantum complexity theorists (as I am in touch with one).

view this post on Zulip John Baez (Aug 09 2022 at 18:32):

Why do you use convex subsets of an ordered commutative $\mathbb{R}$-algebra? I'm familiar with a number of approaches to describing states of a system (which can be seen as states of knowledge) using abstract convex spaces, or convex cones.

view this post on Zulip Duyal Yolcu (Aug 09 2022 at 23:00):

I don't think I understand your question; on one level, the answer to "why" is "because it comes out of the formalism"...

I can say an _advantage_ of getting a convex subset of an $\mathbb{R}$-algebra (which is also an $\mathbb{R}$-vector space, of course) is that one can hope to use convex optimization/duality to some effect (as, indeed, was done in the context of quantum query algorithms by Barnum-Saks-Szegedy and others). If one naively defines addition as addition of probability entries (instead of direct sums, as I do), that doesn't seem to work.

Can you give some references?

view this post on Zulip Duyal Yolcu (Aug 09 2022 at 23:06):

Do some of the approaches you have in mind include a notion of modding out equivalent states of the system (i.e. states that the agent could trivially transform into each other)?

view this post on Zulip John Baez (Aug 09 2022 at 23:18):

Duyal Yolcu said:

I don't think I understand your question; on one level, the answer to "why" is "because it comes out of the formalism"...

It sounds like maybe you're saying "read the paper and you'll figure it out".

view this post on Zulip John Baez (Aug 09 2022 at 23:39):

Duyal Yolcu said:

Can you give some references?

Sure; here are some. In work on "generalized probabilistic theories" people use either convex subsets of vector spaces or convex cones in vector spaces. (The second reference here looks technical, but see Section 2, which is a review of the convex cone approach).

view this post on Zulip John Baez (Aug 09 2022 at 23:40):

In my work with @Owen Lynch on thermodynamics we use convex spaces, defined abstractly and not necessarily subsets of vector spaces.

view this post on Zulip John Baez (Aug 09 2022 at 23:41):

In all this work you can think of points in the relevant convex sets as "states of knowledge", though this "epistemic" approach is not highlighted.

view this post on Zulip John Baez (Aug 09 2022 at 23:43):

Duyal Yolcu said:

Do some of the approaches you have in mind include a notion of modding out equivalent states of the system (i.e. states that the agent could trivially transform into each other)?

I haven't seen people talk about that.

view this post on Zulip Duyal Yolcu (Aug 09 2022 at 23:56):

Thanks for the reply!

If the question is "how", I can say that addition is a direct sum (of the joint probability matrices of the agent's memory plus environment), multiplication is a tensor product (for each environmental state), and multiplication by a nonnegative real number corresponds to changing the norm (l1 norm for classical, squared l2 norm for quantum) of the state. The set of states is then "convex" in the sense that, if 0<=p<=1 and the agent is able to attain X and Y using some algorithm, it should also be able to attain pX+(1-p)Y (by choosing to apply whatever algorithm leads to X with probability p, whatever algorithm leads to Y with probability 1-p, and storing which one it chose in another register).
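To make these operations concrete, here is a minimal numpy sketch under assumed conventions that are not from the thread (rows index environment states, columns index the agent's memory states, entries are joint probabilities):

```python
import numpy as np

# Assumed convention: entry [e, m] is the joint probability of environment
# state e and memory state m (unnormalized states are allowed).
X = np.array([[0.5, 0.0],
              [0.0, 0.5]])    # memory perfectly encodes the environment
Y = np.array([[0.25, 0.25],
              [0.25, 0.25]])  # memory carries no information

def direct_sum(a, b):
    """Addition: direct sum of the memory registers, per environment state."""
    return np.hstack([a, b])

def tensor(a, b):
    """Multiplication: tensor product of the memory registers, per environment state."""
    return np.stack([np.kron(a[e], b[e]) for e in range(a.shape[0])])

# Convex mixture pX + (1-p)Y: run the X-branch with probability p and the
# Y-branch with probability 1-p, recording the branch in a fresh register:
p = 0.3
mix = direct_sum(p * X, (1 - p) * Y)
```

Note that the environment's marginal is preserved: each row of `mix` still sums to the corresponding row sums of X and Y (here 0.5).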

Conversely, if states of knowledge X and Y both allow solving some problem (e.g. calculating a function with at most some maximal error probability), pX+(1-p)Y should also allow that, because it's defined as a direct sum. But if we had defined addition as elementwise sums, this wouldn't be true: if X_p were a joint probability matrix in which the memory encoded the environment through some permutation p, it would allow the agent to determine the environmental state completely - but mixing all the permutations elementwise yields a state of complete ignorance.
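The permutation counterexample can be checked directly; a small illustration of my own (with an assumed normalization where each joint matrix has total mass 1):

```python
import numpy as np
from itertools import permutations

n = 3
# X_sigma: the memory stores the environment state through permutation sigma,
# so each such state on its own lets the agent recover the environment exactly.
perms = [np.eye(n)[list(sigma)] / n for sigma in permutations(range(n))]

# Naive elementwise mixture of all the permutations: every entry becomes
# equal, i.e. memory and environment are independent - complete ignorance.
naive = sum(perms) / len(perms)
```

Here `naive` is the uniform 3x3 matrix: the elementwise mixture has destroyed all the information each individual X_sigma carried.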

view this post on Zulip Duyal Yolcu (Aug 10 2022 at 00:07):

John Baez said:

Do some of the approaches you have in mind include a notion of modding out equivalent states of the system (i.e. states that the agent could trivially transform into each other)?

I haven't seen people talk about that.

I think that's the main point of what I do. For example, in the writeup, the whole $S_\mathrm{quant}$ thing should actually be nothing but the set of (unnormalized) density operators, i.e. positive semidefinite matrices, on E: The set of wavefunctions on a bipartite system, modulo local transformations on one part, is the set of density matrices. A direct sum of wavefunctions corresponds to an elementwise sum of density matrices, and it turns out that the product notion on the "states of knowledge" corresponds to a Hadamard (elementwise) product of the density matrices. Now that's basically what people in quantum query complexity work with.
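Both correspondences (direct sum of wavefunctions to elementwise sum of density matrices, product to Hadamard product) can be spot-checked numerically. A sketch under assumed conventions, not taken from the writeup: rows of a wavefunction matrix index the environment basis E, columns the memory basis.

```python
import numpy as np

rng = np.random.default_rng(0)
# Unnormalized bipartite "wavefunctions": rows index the environment basis E,
# columns index the agent's memory basis.
psi1 = rng.normal(size=(3, 2))
psi2 = rng.normal(size=(3, 4))

def rho(psi):
    """Unnormalized density operator on E (the memory register traced out)."""
    return psi @ psi.conj().T

# Direct sum of the memory registers <-> elementwise sum of density matrices:
sum_check = np.allclose(rho(np.hstack([psi1, psi2])), rho(psi1) + rho(psi2))

# Per-environment-state tensor product <-> Hadamard (elementwise) product:
prod = np.stack([np.kron(psi1[e], psi2[e]) for e in range(3)])
prod_check = np.allclose(rho(prod), rho(psi1) * rho(psi2))
```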

Your second reference refers to this paper that talks about query complexity in "generalised probabilistic theories" and was new to me, but I don't see mentions of the adversary bound/convex optimization there and it doesn't obviously look like what I do...

view this post on Zulip John Baez (Aug 10 2022 at 00:09):

I'll have to think about this - I don't understand it at all.

Btw, you need to use two dollar signs, instead of the usual one, to make TeX work here.

view this post on Zulip Duyal Yolcu (Aug 10 2022 at 23:09):

Duyal Yolcu said:
(I updated the text, it now develops a concrete example in section 6 where I write down some representative matrices considering an agent observing biased coin flips, also fixed a typo and missing matrix transposes).

view this post on Zulip Duyal Yolcu (Sep 16 2022 at 16:03):

In the meantime, I realized that, in the classical case, the structure I am interested in is basically the following commutative semiring[^1]:

  1. Elements are measures (i.e. probability distributions minus the normalization constraint) over probability distributions over the environment's states. So the agent has some Bayesian beliefs about the environment's state (a probability distribution over $E$ - a member of the probability simplex $\Delta^E$) - and which beliefs it currently has is itself random under some probability distribution.
  2. Addition is addition of these measures (and convex mixtures correspond to probabilistic combinations),
  3. Multiplication corresponds to Bayesian updates on new information: If the agent's current belief is a probability vector over $E$, $\vec{p_1}\in \Delta^E$, and it observes an event with probability distribution $\vec{p_2}$, the posterior probability it assigns to environmental state $e\in E$ should be

$$(\vec{p_1}*\vec{p_2})_e := \frac{(\vec{p_1})_e(\vec{p_2})_e}{\sum_{e'\in E} (\vec{p_1})_{e'}(\vec{p_2})_{e'}};$$

i.e. the result of multiplying the probabilities and renormalizing.

As the agent's beliefs and the observed events are themselves randomized (i.e. given by probability measures), we take the pushforward measure of these probability measures under the operation (i.e. calculate the probability distribution over the resulting probability estimates).

We can show that this multiplication (on probability measures) is bilinear, associative and commutative.
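A minimal numeric check of the update rule, and of its commutativity and associativity at the level of individual belief vectors (the specific numbers are just an illustration):

```python
import numpy as np

def bayes(p1, p2):
    """Pointwise product of probability vectors, renormalized (the * above)."""
    q = np.asarray(p1, dtype=float) * np.asarray(p2, dtype=float)
    return q / q.sum()

prior = np.array([0.5, 0.5])         # uniform belief over two environment states
likelihood = np.array([0.9, 0.3])    # probability of the observed event given e
posterior = bayes(prior, likelihood)  # -> [0.75, 0.25]

# The operation is commutative and associative:
r = np.array([0.2, 0.8])
comm = np.allclose(bayes(likelihood, r), bayes(r, likelihood))
assoc = np.allclose(bayes(bayes(prior, likelihood), r),
                    bayes(prior, bayes(likelihood, r)))
```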

By a Grothendieck group construction, we should be able to get from the semiring to a (commutative) ring (concretely, it should mean that probability vectors can have negative probability densities). Then we can do that formal-differential-equations-of-knowledge stuff discussed in section 8 in that writeup; and I claim that setting things up in the way done in the writeup leads to analogous structures for pure-quantum/mixed-quantum situations.
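As a sketch of the semiring (and, allowing negative weights, the ring) on finitely supported measures; this is my own toy encoding, not the writeup's: a state of knowledge is a dict mapping belief vectors (tuples) to weights, addition adds weights, and multiplication pushes the product measure forward under the Bayes update.

```python
import numpy as np
from collections import defaultdict

def bayes(p1, p2):
    """Bayes update of one belief vector by another (pointwise product, renormalized)."""
    q = np.asarray(p1) * np.asarray(p2)
    return tuple(q / q.sum())

def add(m1, m2):
    """Addition of finitely supported measures over belief vectors."""
    out = defaultdict(float)
    for m in (m1, m2):
        for belief, weight in m.items():
            out[belief] += weight
    return dict(out)

def mul(m1, m2):
    """Pushforward of the product measure under the Bayes update.
    Negative weights (the Grothendieck step) work without any change."""
    out = defaultdict(float)
    for b1, w1 in m1.items():
        for b2, w2 in m2.items():
            out[bayes(b1, b2)] += w1 * w2
    return dict(out)

uniform = {(0.5, 0.5): 1.0}              # a unit weight on the uniform belief
m = {(0.9, 0.1): 0.5, (0.5, 0.5): 0.5}   # a 50/50 mixture of two beliefs
```

For example, `mul(uniform, m)` gives a measure equal to `m` again, since updating any belief by a uniform likelihood changes nothing.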

Now my questions are:

  1. Is this formulation clearer to the people reading here?
  2. Is someone aware of a discussion of this object?

[^1]: (technically, the construction given in the writeup currently yields only distributions with finite support)