You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hi,
I have questions regarding ways to "aggregate things". We have well-known notions of statistics, like the "average" or the "median" of a collection of numbers. It is also well-known that these statistics can be reformulated as optimization problems, e.g.:
$\mathrm{avg}(x_1, \dots, x_n) = \operatorname{argmin}_z \sum_i (x_i - z)^2$
or
$\mathrm{med}(x_1, \dots, x_n) = \operatorname{argmin}_z \sum_i |x_i - z|$
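To make the two optimization problems concrete, here is a tiny numeric sketch (the data and the grid search are made up for illustration):

```python
# Scan candidate centers z and minimize two total costs over made-up data.
xs = [1.0, 2.0, 2.0, 7.0]
zs = [i / 1000 for i in range(10001)]  # candidate centers in [0, 10]

sq_cost = {z: sum((x - z) ** 2 for x in xs) for z in zs}   # minimized by the mean
abs_cost = {z: sum(abs(x - z) for x in xs) for z in zs}    # minimized by the median

mean_hat = min(zs, key=sq_cost.get)     # grid argmin of the squared cost
median_hat = min(zs, key=abs_cost.get)  # grid argmin of the absolute cost
print(mean_hat, median_hat)  # 3.0 2.0
```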
It is also well-known that limits/co-limits are some kind of optimization problems. So it seems natural to ask if one can reformulate the average or median (or similar) of a set of objects in category-theoretical terms.
I thought it would be easy to google my way through, but it turns out not to be so easy to find.
Are there known references that approach that question?
ps: I am thinking about something as follows. Let $C$ be a $[0,\infty]$-enriched category (à la Lawvere metric space), and $S$ a finite subset of objects of $C$. For any object $z$ of $C$, we can compute the "total cost" in $[0,\infty]$
$\sum_{x \in S} C(x,z)$
with $C(x,z)$ the hom-object in $[0,\infty]$. Then we can look for the terminal object (minimizer) of the full subcat spanned by these total costs, and check how it can be realized in $C$, and so on. Or something like that.
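The construction in the ps can be tried out on a finite example. A minimal sketch, with a made-up symmetric cost table standing in for the $[0,\infty]$-enriched category:

```python
# C is a Lawvere metric space given by a cost table, S a finite set of
# points, and the "total cost" at z is the sum of d(x, z) over x in S.
# (The table is symmetric here, so the direction of homs doesn't matter.)
points = ["a", "b", "c", "d"]
d = {
    "a": {"a": 0, "b": 1, "c": 4, "d": 6},
    "b": {"a": 1, "b": 0, "c": 3, "d": 5},
    "c": {"a": 4, "b": 3, "c": 0, "d": 2},
    "d": {"a": 6, "b": 5, "c": 2, "d": 0},
}
S = ["a", "b", "d"]

def total_cost(z):
    return sum(d[x][z] for x in S)

# The minimizer plays the role of the terminal object among total costs.
medoid = min(points, key=total_cost)
print(medoid, total_cost(medoid))  # b 6
```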
pps: The closest paper I found was a paper by Matilde Marcolli on "Pareto categories", where she provides a reformulation of Pareto optimization in the context of categories of resources.
ppps: It's also quite funny that the "median" may possibly be easier to reformulate than the "average". For the average we would need to give meaning to quadratic expressions for objects in $C$.
A quadratic expression might use both $d(z,x)$ and $d(x,z)$ in a nicely symmetric way.
Exactly. I've been wondering this myself:
We would like to have something along these lines:
$\sum_{x} d(z,x) \cdot d(x,z)$
which seems very similar to this:
$\coprod_x C(z,x) \times C(x,z) .$
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
(And if we enrich in real numbers, the coproduct is either "sup" or "inf", not the usual sum.)
So what I think is going on here is a further generalization of the idea of enriched category.
I've been thinking for a while about this, without ever managing to figure it out. If anyone has concrete ideas, I'd love to hear about them!
For context, this would not just be for averages, but in general to talk about (usual) matrix multiplication.
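Regarding the matrix-multiplication remark: here is a toy sketch (my code, not from the thread) of the same composition formula parameterized by a "sum" and a "product". Over $(+, \cdot)$ it is ordinary matrix multiplication; over $(\min, +)$, which matches the Cost enrichment, it is shortest-path-style composition, with $\min$ playing the role the coproduct plays for sets:

```python
# Generic "matrix composition" over a semiring (add, mul, zero).
def compose(A, B, add, mul, zero):
    n, k, m = len(A), len(B), len(B[0])
    out = [[zero for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = zero
            for t in range(k):
                acc = add(acc, mul(A[i][t], B[t][j]))
            out[i][j] = acc
    return out

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
usual = compose(A, B, lambda a, b: a + b, lambda a, b: a * b, 0)
INF = float("inf")
cost = compose(A, B, min, lambda a, b: a + b, INF)
print(usual)  # [[19, 22], [43, 50]]
print(cost)   # [[6, 7], [8, 9]]
```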
Can all generalized means be formulated as optimization problems? (in particular, can the geometric mean?)
Ralph Sarkis said:
Can all generalized means be formulated as optimization problems? (in particular, can the geometric mean?)
Yes, there is a summary table on Wikipedia.
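For the geometric mean specifically, the optimization formulation uses the log metric $d(x,z) = |\log x - \log z|$: the geometric mean of positive numbers minimizes the total squared log-distance. A quick numeric check (standard fact; the grid search is just for illustration):

```python
import math

xs = [1.0, 4.0, 16.0]

def log_cost(z):
    # total squared distance in the log metric
    return sum((math.log(x) - math.log(z)) ** 2 for x in xs)

zs = [i / 100 for i in range(1, 3001)]  # positive candidates in (0, 30]
z_hat = min(zs, key=log_cost)

geo_mean = math.exp(sum(math.log(x) for x in xs) / len(xs))
print(z_hat, geo_mean)  # both equal 4.0 here
```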
Paolo Perrone said:
Exactly. I've been wondering this myself:
We would like to have something along these lines:
$\sum_{x} d(z,x) \cdot d(x,z)$
which seems very similar to this:
$\coprod_x C(z,x) \times C(x,z) .$
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
(And if we enrich in real numbers, the coproduct is either "sup" or "inf", not the usual sum.)
So what I think is going on here is a further generalization of the idea of enriched category. I've been thinking for a while about this, without ever managing to figure it out. If anyone has concrete ideas, I'd love to hear about them!
This looks very tempting to play with indeed! I guess we could just assume that $V$ has two compatible monoidal structures, like $(Vect, \otimes, \oplus)$. Aren't there papers about $Vect$-enriched categories?
We can also think about matrices as profunctors, and composition is given by the coend
$(G \circ F)(a,c) = \int^{b} F(a,b) \times G(b,c) .$
I'm sure you already know that. But then why doesn't the "coend" picture fit the bill for you?
I think the "median" given in the original question is just the product in a Lawvere metric space actually.
Furthermore, it's actually a "medoid", and depending on which Lawvere metric space structure you put on the sample space, it captures a lot of the different generalized means.
Peva Blanchard said:
Paolo Perrone said:
Exactly. I've been wondering this myself:
We would like to have something along these lines:
$\sum_{x} d(z,x) \cdot d(x,z)$
which seems very similar to this:
$\coprod_x C(z,x) \times C(x,z) .$
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
(And if we enrich in real numbers, the coproduct is either "sup" or "inf", not the usual sum.)
So what I think is going on here is a further generalization of the idea of enriched category. I've been thinking for a while about this, without ever managing to figure it out. If anyone has concrete ideas, I'd love to hear about them!
This looks very tempting to play with indeed! I guess we could just assume that $V$ has two compatible monoidal structures, like $(Vect, \otimes, \oplus)$. Aren't there papers about $Vect$-enriched categories?
We can also think about matrices as profunctors, and composition is given by the coend
$(G \circ F)(a,c) = \int^{b} F(a,b) \times G(b,c) .$
I'm sure you already know that. But then why doesn't the "coend" picture fit the bill for you?
In nice situations, a coend is a coequalizer of a coproduct. I guess one can replace the "coproduct" part and keep the coequalizer? I don't know.
I can't come up with a consistent definition of "generalized enriched category" in that context though.
James Deikun said:
I think the "median" given in the original question is just the product in a Lawvere metric space actually.
Mmh, I'm not sure about that, because it seems to me that the product in a Lawvere metric space is rather ill-behaved.
Let $\mathcal{V}$ be Lawvere's Cost category $([0,\infty], \geq, +, 0)$. Let $\phi$ be a weighting over $S$ (seen as a discrete category), and $\ell$ be the corresponding weighted product of objects of $S$ (assuming it exists). Then, unfolding the definition of weighted limit, we have, for every object $z$ in $C$,
$C(z, \ell) = \sup_{x \in S} \big( C(z,x) \mathbin{\dot-} \phi(x) \big)$
where the minus sign stands for truncated subtraction. In particular, if the weighting assigns $0$ to every object of $S$, then (taking $z = \ell$) the limit is at distance $0$ from every object in $S$, which, most of the time, is possible only when $S$ is a singleton.
Taking co-products will not help much either, as it will just swap $C(z,x)$ for $C(x,z)$. It seems to me that the median/medoid/mean/etc. cannot be expressed as weighted limits/co-limits of this kind.
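This argument can be checked numerically. A sketch, with a made-up three-point Lawvere metric, of the weighted-limit formula $C(z,\ell) = \sup_{x \in S}\big(C(z,x) \mathbin{\dot-} \phi(x)\big)$ under the zero weighting:

```python
def trunc_sub(a, b):  # truncated subtraction on [0, inf)
    return max(a - b, 0.0)

S = ["a", "b"]
d = {  # a made-up Lawvere metric (costs d[z][x])
    "a": {"a": 0.0, "b": 3.0, "m": 1.0},
    "b": {"a": 3.0, "b": 0.0, "m": 2.0},
    "m": {"a": 1.0, "b": 2.0, "m": 0.0},
}
phi = {"a": 0.0, "b": 0.0}  # the zero weighting

def required_dist(z):  # C(z, l) prescribed by the weighted-limit formula
    return max(trunc_sub(d[z][x], phi[x]) for x in S)

# Taking z = l forces max_x d(l, x) = 0: no point of this space qualifies,
# since no z is at distance 0 from both a and b simultaneously.
print({z: required_dist(z) for z in d})  # {'a': 3.0, 'b': 3.0, 'm': 2.0}
```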
Ah, you're right, I guess it's not quite a limit. Actually you could view $\prod_x C(z,x)$ as the space of cones with vertex $z$, and here instead of the one terminal cone you're looking for an object $m$ so that every cone over $S$ is represented by some cone with apex $m$, instead of a particular one. So it's a particular kind of multi-limit I guess?
Well, I tried to look at some concrete examples, when we take only two objects, say $a$ and $b$.
With the enriching category $(\mathbf{Set}, \times)$. We are looking at the full sub-category of $\mathbf{Set}$ spanned by the
$C(a,z) \times C(b,z)$
for all $z$ in $C$. Now, if $C$ has a terminal object $t$, then
$C(a,t) \times C(b,t) \cong 1 \times 1 \cong 1$
which is terminal in $\mathbf{Set}$, and thus terminal in the sub-category as well. This implies that the "median" of $a$ and $b$ in $C$ is just the terminal object of $C$.
This argument, however, does not apply if we choose the set-co-product as monoidal product instead, i.e., $(\mathbf{Set}, +)$. The full sub-cat is spanned by
$C(a,z) + C(b,z)$
for all $z$ in $C$. If $C$ has a terminal object $t$, we have
$C(a,t) + C(b,t) \cong 1 + 1 \cong 2$
which is not terminal in $\mathbf{Set}$.
If we assume that $C(a,m) + C(b,m)$ is terminal in the sub-category for some $m$ in $C$, it means we have, for all $z$, a unique function
$C(a,z) + C(b,z) \rightarrow C(a,m) + C(b,m)$
I think this implies that $C(a,m) + C(b,m) \cong 1$ (take $z = m$: the identity must be the only endofunction), which in turn implies
$C(a,m) \cong 1 \text{ and } C(b,m) \cong \varnothing$
or
$C(a,m) \cong \varnothing \text{ and } C(b,m) \cong 1$
Either way, it does not look very promising for $(\mathbf{Set}, +)$.
I don't know if it's useful, but I'm thinking of this: if you have a category enriched over $\mathbb{Q}_{\geq 0}$-modules, then you can average a finite set of parallel morphisms $f_1, \dots, f_n$ by taking $\frac{1}{n}(f_1 + \dots + f_n)$.
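A concrete toy instance of this remark (my choice of example): matrices over the nonnegative rationals as the parallel morphisms of a $\mathbb{Q}_{\geq 0}$-linear category, averaged entrywise:

```python
from fractions import Fraction as F

# Average n parallel "morphisms" (same-shape matrices over Q>=0)
# as (1/n)(f_1 + ... + f_n), computed entrywise with exact rationals.
def avg(*mats):
    n = F(len(mats))
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)]
            for i in range(rows)]

f1 = [[F(1), F(2)], [F(0), F(4)]]
f2 = [[F(3), F(0)], [F(2), F(0)]]
print(avg(f1, f2))  # entrywise (f1 + f2) / 2
```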
Probably even less useful: I'm saying this because it happened that one day I used a notion of "probabilistic permutation" (something that I called a "blender") in a symmetric monoidal $\mathbb{Q}_{\ge 0}$-linear category that I defined as a natural transformation $A^{\otimes n} \rightarrow A^{\otimes n}$ of the form $\frac{1}{|M|}\underset{\sigma \in M}{\sum}\sigma$ for some multiset $M$ of permutative natural transformations $A^{\otimes n} \rightarrow A^{\otimes n}$.
This is all I know about probability + categories enriched over the category of (semi) vector spaces over a (semi) field :sweat_smile:.
By the way, why do you want to average objects? Isn't it more natural to average morphisms which can be interpreted as processes?
When you take a limit, you take the limit of a cone, so it involves a set of morphisms rather than just a set of objects. I think the morphisms are the context which is creating the combinatorics of the situation thus allowing an interpretation in terms of optimization.
(just some random thoughts)
Generalization (I think, maybe loosely) of the first sentence I wrote, by replacing the addition by a tensor product and allowing the domains and codomains to be different: if you have a monoidal $\mathbb{Q}_{\geq 0}$-linear category (i.e. enriched over $\mathbb{Q}_{\geq 0}$-modules/semi vector spaces), you could say that the average of the $f_i : A_i \rightarrow B_i$ for $1 \le i \le n$ is
$\frac{1}{n}(f_1 \otimes \dots \otimes f_n) : A_1 \otimes \dots \otimes A_n \rightarrow B_1 \otimes \dots \otimes B_n$
Jean-Baptiste Vienney said:
Probably even less useful: I'm saying this because it happened that one day I used a notion of "probabilistic permutation" (something that I called a "blender") in a symmetric monoidal $\mathbb{Q}_{\ge 0}$-linear category that I defined as a natural transformation $A^{\otimes n} \rightarrow A^{\otimes n}$ of the form $\frac{1}{|M|}\underset{\sigma \in M}{\sum}\sigma$ for some multiset $M$ of permutative natural transformations $A^{\otimes n} \rightarrow A^{\otimes n}$.
By the way, if ever it was useful, I found that you can write 12 different commutative diagrams involving the interaction of three different operations on blenders derived from tensor product, composition and enrichment. So I found this notion very cool.
Now I stop talking about my stuff :grinning_face_with_smiling_eyes:
Can someone recall the interpretation of a limit as the solution of an optimization problem? I think it would be useful in order to understand whether there is really something probabilistic behind this idea, or whether it is too vague to say that it has anything to do with optimization.
Sometimes mathematicians talk about some "philosophical interpretation" of some concept, and I often find it rather vague (like the interpretation of the Yoneda lemma).
Maybe, can someone also recall a definition of optimization of something in probability/statistics, so that we can compare the two ideas? It could help in understanding how to transfer the idea of mean from probability/statistics to category theory.
And does optimization really have anything to do with means? It seems to me that aggregation and thus means have to do with tensor products which are not limits.
Jean-Baptiste Vienney said:
Sometimes mathematicians talk about some "philosophical interpretation" of some concept, and I often find it rather vague (like the interpretation of the Yoneda lemma).
You're right! Now that you mention it, I'm not sure anymore that "it is well-known that limits/co-limits are some kind of optimization problems". I think I have in mind the case of posets with least upper bounds / greatest lower bounds. But the narrative is certainly too vague.
This is probably not intelligent, but you could average objects like this in a compact closed category with coproducts (= biproducts)
( copies of )
Jean-Baptiste Vienney said:
And does optimization really have anything to do with means? It seems to me that aggregation and thus means have to do with tensor products which are not limits.
Regarding ways to interpret "averaging" in category theory, there is a very interesting line of work by Tom Leinster and Tom Avery on codensity monads.
I still have to digest the details, so I'm going to be very hand-wavy. The "slogan" I understood is that the elements of the codensity monad behave like integration operators. The usual integration operators are recovered when considering the inclusion of a certain sub-category of the category of measurable spaces, whose codensity monad is the Giry monad. When considering the inclusion $\mathbf{FinSet} \hookrightarrow \mathbf{Set}$, we get the ultrafilter monad, and the corresponding integration operator corresponds to "voting w.r.t. the ultrafilter".
But for other reasons, I am also interested in the median (because it's a basic example of "robust" statistics). Hence me wondering whether we could categorify the "median". I thought that there could be a way through the formulation of the median as an optimization problem. But clearly, I'm not sure it's the right way.
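On "voting w.r.t. the ultrafilter": in the finite case every ultrafilter is principal, so the vote is dictated by a single point. A small sketch of that slogan (names and data made up):

```python
from itertools import chain, combinations

X = [0, 1, 2]  # a finite set of voters

def principal_ultrafilter(x0):
    # On a finite set, every ultrafilter is principal: it consists of
    # exactly the subsets containing some fixed point x0.
    subsets = chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))
    return [set(s) for s in subsets if x0 in s]

def vote(U, ballots):
    # The elected candidate is the unique one whose set of supporters
    # belongs to the ultrafilter U.
    for cand in set(ballots.values()):
        if {x for x in X if ballots[x] == cand} in U:
            return cand

U = principal_ultrafilter(1)
print(vote(U, {0: "A", 1: "B", 2: "A"}))  # B: voter 1 is the dictator
```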
I don't understand much, but does it have anything to do with integrating with respect to a probability measure giving the probability of an event, and with using the coend/integral analogy?
I'm just asking naive questions, I don't know much about codensity monads and probability theory, but I like to think and learn about any stuff :sweat_smile:
Also, maybe weighted coends would be useful for averaging, if this notion makes sense.
Well, I will have trouble answering, because I don't know much about codensity monads either ^^' They are on my list of "things it would be good to understand at some point".
However, it's not the analogy between coend and integral, although there is a formula involving an end.
If you have a functor $F : A \rightarrow B$, then you can ask for the right Kan extension of $F$ along itself, which, if the extension exists, yields a functor $T^F : B \rightarrow B$. The universal property of the Kan extension makes $T^F$ a monad on $B$.
There is a formula using ends. For any object $b$ of $B$
$T^F(b) = \int_{a \in A} F(a)^{B(b, F(a))}$
where $F(a)^{B(b, F(a))}$ is a power.
Intuitively, an element of $T^F(b)$ is a machine that takes, for any $a$, an element of $B(b, F(a))$ and outputs an element of $F(a)$. In other words, an element of $T^F(b)$ acts like an integration operator for functions defined over $b$.
One instance of this machinery leads to $T^F$ being the Giry monad, i.e., the monad that associates to every measurable space $X$ the set of probability distributions over $X$. Any such probability distribution on $X$ defines an integration operator for measurable functions on $X$, i.e., the operator that outputs the expected value.
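The finite, discrete shadow of that intuition (toy data of my own): a probability distribution on a finite set is precisely a machine turning functions into numbers, namely their expected value:

```python
# A distribution p on a finite set X acts as an integration operator:
# it takes a function h : X -> R and returns its expected value.
X = ["rain", "sun", "snow"]
p = {"rain": 0.5, "sun": 0.3, "snow": 0.2}

def integrate(h):
    return sum(p[x] * h(x) for x in X)

temperature = {"rain": 10.0, "sun": 25.0, "snow": -2.0}
print(integrate(temperature.get))  # 0.5*10 + 0.3*25 + 0.2*(-2)
```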
I'm thinking about something completely different. I think a one-object category enriched over commutative monoids, such that every non-zero morphism is invertible and $n \cdot 1$ is always nonzero, is the same as a semi-field of characteristic $0$ (the multiplication is given by the composition).
And in a semi-field of characteristic $0$ you can average by the formula
$\frac{x_1 + \dots + x_n}{n} = (x_1 + \dots + x_n) \cdot (n \cdot 1)^{-1}$
which becomes here
$(f_1 + \dots + f_n) \cdot (n \cdot \mathrm{id})^{-1}$
for morphisms $f_1, \dots, f_n$ of the one-object category.
I don't think it will help you in any way :joy:
Categorifying this, you get that if you have parallel morphisms $f_1, \dots, f_n : A \rightarrow B$ in such a category, then you can define the mean as $(f_1 + \dots + f_n) \circ (n \cdot \mathrm{id}_A)^{-1}$ (when $n \cdot \mathrm{id}_A$ is invertible), but this is just what I was saying before, nothing amazing (something like this; there is probably some nonsense in what I've written).
I should not talk when I don't have anything interesting to say and don't understand half of what is written, sorry.
Jean-Baptiste Vienney said:
I should not talk when I don't have anything interesting to say and don't understand half of what is written, sorry.
Oh no, don't worry about that :)
Jean-Baptiste Vienney said:
By the way, why do you want to average objects? Isn't it more natural to average morphisms which can be interpreted as processes?
[Arrow category enters the chat]
Peva Blanchard said:
Well, I tried to look at some concrete examples, when we take only two objects, say $a$ and $b$.
With the enriching category $(\mathbf{Set}, \times)$. We are looking at the full sub-category of $\mathbf{Set}$ spanned by the
$C(a,z) \times C(b,z)$
for all $z$ in $C$. Now, if $C$ has a terminal object $t$, then
$C(a,t) \times C(b,t) \cong 1 \times 1 \cong 1$
which is terminal in $\mathbf{Set}$, and thus terminal in the sub-category as well. This implies that the "median" of $a$ and $b$ in $C$ is just the terminal object of $C$.
Well, in a category with a terminal object, the "distance" from any object to the terminal object is the monoidal unit and terminal object of the enriching category, aka "zero", so it's not surprising that it turns out this way when minimizing total distance... if you had the same situation in a metric space, the same would happen.
Jean-Baptiste Vienney said:
Jean-Baptiste Vienney said:
Probably even less useful: I'm saying this because it happened that one day I used a notion of "probabilistic permutation" (something that I called a "blender") in a symmetric monoidal $\mathbb{Q}_{\ge 0}$-linear category that I defined as a natural transformation $A^{\otimes n} \rightarrow A^{\otimes n}$ of the form $\frac{1}{|M|}\underset{\sigma \in M}{\sum}\sigma$ for some multiset $M$ of permutative natural transformations $A^{\otimes n} \rightarrow A^{\otimes n}$.
By the way, if ever it was useful, I found that you can write 12 different commutative diagrams involving the interaction of three different operations on blenders derived from tensor product, composition and enrichment. So I found this notion very cool.
This is very interesting. I would love to see a writeup of this.
Paolo Perrone said:
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
What if you formed a presheaf over the collection of products instead of a coproduct over them? Sheaves come equipped with stalks that can be used for convergence/ultrafilter kind of arguments -- if you have any measure of closeness or topology you could somehow filter down to the stalk which contains the single product most representative of the collection. This might have to do with the rate of stalk "narrowing" -- how many products it loses how fast... That's what we use averages and medians for anyway -- pick out a "representative member" of an enclosing set; here there may not be an enclosing thing, so we do our best with the members we have?
(Having equipped a site with a sheaf thus, you also usually get a sheaf of rings, using product and composition, I think... there is a forced sense in which the product could distribute over the coproduct by broadcasting arguments, though I have no idea if any of that gives you a desirable distributive property.)
I think all I have described reduces to taking a product of products, and then looking at various quotients of that which project down some products and lose others. Compositions of projectors then form a sequence that gives you the stalks on objects. The hope would be that sheaves might also give you more, but just looking at the "rate of change" of the cost over these sequences might be illuminating, and would require no extra machinery... it looks vaguely geometric-mean adjacent. I don't know how to actually access or manipulate them, however, as I know only the rudiments of cost categories.