You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hi,
I have questions regarding ways to "aggregate things". We have well-known notions of statistics, like the "average" or the "median" of a collection of numbers. It is also well-known that these statistics can be reformulated as optimization problems, e.g.:
$\mathrm{avg}(x_1, \dots, x_n) = \operatorname{argmin}_z \sum_i (x_i - z)^2$
or
$\mathrm{med}(x_1, \dots, x_n) = \operatorname{argmin}_z \sum_i |x_i - z|$
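To make the two optimization problems concrete, here is a tiny numeric sketch (the data and the grid search are made up for illustration):

```python
# Scan candidate centers z and minimize two total costs over made-up data.
xs = [1.0, 2.0, 2.0, 7.0]
zs = [i / 1000 for i in range(10001)]  # candidate centers in [0, 10]

sq_cost = {z: sum((x - z) ** 2 for x in xs) for z in zs}   # minimized by the mean
abs_cost = {z: sum(abs(x - z) for x in xs) for z in zs}    # minimized by the median

mean_hat = min(zs, key=sq_cost.get)     # grid argmin of the squared cost
median_hat = min(zs, key=abs_cost.get)  # grid argmin of the absolute cost
print(mean_hat, median_hat)  # 3.0 2.0
```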
It is also well-known that limits/co-limits are some kind of optimization problems. So it seems natural to ask if one can reformulate the average or median (or similar) of a set of objects in category-theoretical terms.
I thought it would be easy to google my way through, but it turns out not to be so easy to find.
Are there known references that approach that question?
ps: I am thinking about something as follows. Let $C$ be a $[0,\infty]$-enriched category (à la Lawvere metric space), and $S$ a finite subset of objects of $C$. For any object $z$ of $C$, we can compute the "total cost" in $[0,\infty]$
$\sum_{x \in S} C(x,z)$
with $C(x,z)$ the hom-object in $[0,\infty]$. Then we can look for the terminal object (minimizer) of the full subcat spanned by these total costs, and check how it can be realized in $C$, and so on. Or something like that.
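The construction in the ps can be tried out on a finite example. A minimal sketch, with a made-up symmetric cost table standing in for the $[0,\infty]$-enriched category:

```python
# C is a Lawvere metric space given by a cost table, S a finite set of
# points, and the "total cost" at z is the sum of d(x, z) over x in S.
# (The table is symmetric here, so the direction of homs doesn't matter.)
points = ["a", "b", "c", "d"]
d = {
    "a": {"a": 0, "b": 1, "c": 4, "d": 6},
    "b": {"a": 1, "b": 0, "c": 3, "d": 5},
    "c": {"a": 4, "b": 3, "c": 0, "d": 2},
    "d": {"a": 6, "b": 5, "c": 2, "d": 0},
}
S = ["a", "b", "d"]

def total_cost(z):
    return sum(d[x][z] for x in S)

# The minimizer plays the role of the terminal object among total costs.
medoid = min(points, key=total_cost)
print(medoid, total_cost(medoid))  # b 6
```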
pps: The closest paper I found was a paper by Matilde Marcolli on "Pareto categories", where she provides a reformulation of Pareto optimization in the context of categories of resources.
ppps: It's also quite funny that the "median" may possibly be easier to reformulate than the "average". For the average we would need to give meaning to quadratic expressions for objects in $C$.
A quadratic expression might use both $d(z,x)$ and $d(x,z)$ in a nicely symmetric way.
Exactly. I've been wondering this myself:
We would like to have something along these lines:
$\sum_{x} d(z,x) \cdot d(x,z)$
which seems very similar to this:
$\coprod_x C(z,x) \times C(x,z) .$
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
(And if we enrich in real numbers, the coproduct is either "sup" or "inf", not the usual sum.)
So what I think is going on here is a further generalization of the idea of enriched category.
I've been thinking for a while about this, without ever managing to figure it out. If anyone has concrete ideas, I'd love to hear about them!
For context, this would not just be for averages, but in general to talk about (usual) matrix multiplication.
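Regarding the matrix-multiplication remark: here is a toy sketch (my code, not from the thread) of the same composition formula parameterized by a "sum" and a "product". Over $(+, \cdot)$ it is ordinary matrix multiplication; over $(\min, +)$, which matches the Cost enrichment, it is shortest-path-style composition, with $\min$ playing the role the coproduct plays for sets:

```python
# Generic "matrix composition" over a semiring (add, mul, zero).
def compose(A, B, add, mul, zero):
    n, k, m = len(A), len(B), len(B[0])
    out = [[zero for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = zero
            for t in range(k):
                acc = add(acc, mul(A[i][t], B[t][j]))
            out[i][j] = acc
    return out

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
usual = compose(A, B, lambda a, b: a + b, lambda a, b: a * b, 0)
INF = float("inf")
cost = compose(A, B, min, lambda a, b: a + b, INF)
print(usual)  # [[19, 22], [43, 50]]
print(cost)   # [[6, 7], [8, 9]]
```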
Can all generalized means be formulated as optimization problems? (in particular, can the geometric mean?)
Ralph Sarkis said:
Can all generalized means be formulated as optimization problems? (in particular, can the geometric mean?)
Yes, there is a summary table on Wikipedia.
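For the geometric mean specifically, the optimization formulation uses the log metric $d(x,z) = |\log x - \log z|$: the geometric mean of positive numbers minimizes the total squared log-distance. A quick numeric check (standard fact; the grid search is just for illustration):

```python
import math

xs = [1.0, 4.0, 16.0]

def log_cost(z):
    # total squared distance in the log metric
    return sum((math.log(x) - math.log(z)) ** 2 for x in xs)

zs = [i / 100 for i in range(1, 3001)]  # positive candidates in (0, 30]
z_hat = min(zs, key=log_cost)

geo_mean = math.exp(sum(math.log(x) for x in xs) / len(xs))
print(z_hat, geo_mean)  # both equal 4.0 here
```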
Paolo Perrone said:
Exactly. I've been wondering this myself:
We would like to have something along these lines:
$\sum_{x} d(z,x) \cdot d(x,z)$
which seems very similar to this:
$\coprod_x C(z,x) \times C(x,z) .$
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
(And if we enrich in real numbers, the coproduct is either "sup" or "inf", not the usual sum.)
So what I think is going on here is a further generalization of the idea of enriched category. I've been thinking for a while about this, without ever managing to figure it out. If anyone has concrete ideas, I'd love to hear about them!
This looks very tempting to play with indeed! I guess we could just assume that $V$ has two compatible monoidal structures, like $(Vect, \otimes, \oplus)$. Aren't there papers about $Vect$-enriched categories?
We can also think about matrices as profunctors, and composition is given by the coend
$(G \circ F)(a,c) = \int^{b} F(a,b) \times G(b,c) .$
I'm sure you already know that. But then why doesn't the "coend" picture fit the bill for you?
I think the "median" given in the original question is just the product in a Lawvere metric space actually.
Furthermore, it's actually a "medoid", and depending on which Lawvere metric space structure you put on the sample space, it captures a lot of the different generalized means.
Peva Blanchard said:
Paolo Perrone said:
Exactly. I've been wondering this myself:
We would like to have something along these lines:
$\sum_{x} d(z,x) \cdot d(x,z)$
which seems very similar to this:
$\coprod_x C(z,x) \times C(x,z) .$
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
(And if we enrich in real numbers, the coproduct is either "sup" or "inf", not the usual sum.)
So what I think is going on here is a further generalization of the idea of enriched category. I've been thinking for a while about this, without ever managing to figure it out. If anyone has concrete ideas, I'd love to hear about them!
This looks very tempting to play with indeed! I guess we could just assume that $V$ has two compatible monoidal structures, like $(Vect, \otimes, \oplus)$. Aren't there papers about $Vect$-enriched categories?
We can also think about matrices as profunctors, and composition is given by the coend
$(G \circ F)(a,c) = \int^{b} F(a,b) \times G(b,c) .$
I'm sure you already know that. But then why doesn't the "coend" picture fit the bill for you?
In nice situations, a coend is a coequalizer of a coproduct. I guess one can replace the "coproduct" part and keep the coequalizer? I don't know.
I can't come up with a consistent definition of "generalized enriched category" in that context though.
James Deikun said:
I think the "median" given in the original question is just the product in a Lawvere metric space actually.
Mmh, I'm not sure about that, because it seems to me that the product in a Lawvere metric space is rather ill-behaved.
Let $\mathcal{V}$ be Lawvere's Cost category $([0,\infty], \geq, +, 0)$. Let $\phi$ be a weighting over $S$ (seen as a discrete category), and $\ell$ be the corresponding weighted product of objects of $S$ (assuming it exists). Then, unfolding the definition of weighted limit, we have, for every object $z$ in $C$,
$C(z, \ell) = \sup_{x \in S} \big( C(z,x) \mathbin{\dot-} \phi(x) \big)$
where the minus sign stands for truncated subtraction. In particular, if the weighting assigns $0$ to every object of $S$, then (taking $z = \ell$) the limit is at distance $0$ from every object in $S$, which, most of the time, is possible only when $S$ is a singleton.
Taking co-products will not help much either, as it will just swap $C(z,x)$ for $C(x,z)$. It seems to me that the median/medoid/mean/etc. cannot be expressed as weighted limits/co-limits of this kind.
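This argument can be checked numerically. A sketch, with a made-up three-point Lawvere metric, of the weighted-limit formula $C(z,\ell) = \sup_{x \in S}\big(C(z,x) \mathbin{\dot-} \phi(x)\big)$ under the zero weighting:

```python
def trunc_sub(a, b):  # truncated subtraction on [0, inf)
    return max(a - b, 0.0)

S = ["a", "b"]
d = {  # a made-up Lawvere metric (costs d[z][x])
    "a": {"a": 0.0, "b": 3.0, "m": 1.0},
    "b": {"a": 3.0, "b": 0.0, "m": 2.0},
    "m": {"a": 1.0, "b": 2.0, "m": 0.0},
}
phi = {"a": 0.0, "b": 0.0}  # the zero weighting

def required_dist(z):  # C(z, l) prescribed by the weighted-limit formula
    return max(trunc_sub(d[z][x], phi[x]) for x in S)

# Taking z = l forces max_x d(l, x) = 0: no point of this space qualifies,
# since no z is at distance 0 from both a and b simultaneously.
print({z: required_dist(z) for z in d})  # {'a': 3.0, 'b': 3.0, 'm': 2.0}
```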
Ah, you're right, I guess it's not quite a limit. Actually you could view $\prod_x C(z,x)$ as the space of cones with vertex $z$, and here instead of the one terminal cone you're looking for an object $m$ so that every cone over $S$ is represented by some cone with apex $m$, instead of a particular one. So it's a particular kind of multi-limit I guess?
Well, I tried to look at some concrete examples, when we take only two objects, say $a$ and $b$.
With the enriching category $(\mathbf{Set}, \times)$. We are looking at the full sub-category of $\mathbf{Set}$ spanned by the
$C(a,z) \times C(b,z)$
for all $z$ in $C$. Now, if $C$ has a terminal object $t$, then
$C(a,t) \times C(b,t) \cong 1 \times 1 \cong 1$
which is terminal in $\mathbf{Set}$, and thus terminal in the sub-category as well. This implies that the "median" of $a$ and $b$ in $C$ is just the terminal object of $C$.
This argument, however, does not apply if we choose the set-co-product as monoidal product instead, i.e., $(\mathbf{Set}, +)$. The full sub-cat is spanned by
$C(a,z) + C(b,z)$
for all $z$ in $C$. If $C$ has a terminal object $t$, we have
$C(a,t) + C(b,t) \cong 1 + 1 \cong 2$
which is not terminal in $\mathbf{Set}$.
If we assume that $C(a,m) + C(b,m)$ is terminal in the sub-category for some $m$ in $C$, it means we have, for all $z$, a unique function
$C(a,z) + C(b,z) \rightarrow C(a,m) + C(b,m)$
I think this implies that $C(a,m) + C(b,m) \cong 1$ (take $z = m$: the identity must be the only endofunction), which in turn implies
$C(a,m) \cong 1 \text{ and } C(b,m) \cong \varnothing$
or
$C(a,m) \cong \varnothing \text{ and } C(b,m) \cong 1$
Either way, it does not look very promising for $(\mathbf{Set}, +)$.
I don't know if it's useful, but I'm thinking of this: if you have a category enriched over $\mathbb{Q}_{\geq 0}$-modules, then you can average a finite set of parallel morphisms $f_1, \dots, f_n$ by taking $\frac{1}{n}(f_1 + \dots + f_n)$.
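A concrete toy instance of this remark (my choice of example): matrices over the nonnegative rationals as the parallel morphisms of a $\mathbb{Q}_{\geq 0}$-linear category, averaged entrywise:

```python
from fractions import Fraction as F

# Average n parallel "morphisms" (same-shape matrices over Q>=0)
# as (1/n)(f_1 + ... + f_n), computed entrywise with exact rationals.
def avg(*mats):
    n = F(len(mats))
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)]
            for i in range(rows)]

f1 = [[F(1), F(2)], [F(0), F(4)]]
f2 = [[F(3), F(0)], [F(2), F(0)]]
print(avg(f1, f2))  # entrywise (f1 + f2) / 2
```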
Probably even less useful: I'm saying this because it happened that one day I used a notion of "probabilistic permutation" (something that I called a "blender") in a symmetric monoidal $\mathbb{Q}_{\ge 0}$-linear category that I defined as a natural transformation $A^{\otimes n} \rightarrow A^{\otimes n}$ of the form $\frac{1}{|M|}\underset{\sigma \in M}{\sum}\sigma$ for some multiset $M$ of permutative natural transformations $A^{\otimes n} \rightarrow A^{\otimes n}$.
This is all I know about probability + categories enriched over the category of (semi) vector spaces over a (semi) field :sweat_smile:.
By the way, why do you want to average objects? Isn't it more natural to average morphisms which can be interpreted as processes?
When you take a limit, you take the limit of a cone, so it involves a set of morphisms rather than just a set of objects. I think the morphisms are the context which is creating the combinatorics of the situation thus allowing an interpretation in terms of optimization.
(just some random thoughts)
Generalization (I think, maybe loosely) of the first sentence I wrote, by replacing the addition by a tensor product and allowing the domains and codomains to be different: if you have a monoidal $\mathbb{Q}_{\geq 0}$-linear category (i.e. enriched over $\mathbb{Q}_{\geq 0}$-modules/semi vector spaces), you could say that the average of the $f_i : A_i \rightarrow B_i$ for $1 \le i \le n$ is
$\frac{1}{n}(f_1 \otimes \dots \otimes f_n) : A_1 \otimes \dots \otimes A_n \rightarrow B_1 \otimes \dots \otimes B_n$
Jean-Baptiste Vienney said:
Probably even less useful: I'm saying this because it happened that one day I used a notion of "probabilistic permutation" (something that I called a "blender") in a symmetric monoidal $\mathbb{Q}_{\ge 0}$-linear category that I defined as a natural transformation $A^{\otimes n} \rightarrow A^{\otimes n}$ of the form $\frac{1}{|M|}\underset{\sigma \in M}{\sum}\sigma$ for some multiset $M$ of permutative natural transformations $A^{\otimes n} \rightarrow A^{\otimes n}$.
By the way, if ever it was useful, I found that you can write 12 different commutative diagrams involving the interaction of three different operations on blenders derived from tensor product, composition and enrichment. So I found this notion very cool.
Now I stop talking about my stuff :grinning_face_with_smiling_eyes:
Can someone recall the interpretation of a limit as the solution of an optimization problem? I think it would be useful in order to understand whether there is really something probabilistic behind this idea, or whether it is too vague to say that it has anything to do with optimization.
Sometimes mathematicians talk about some "philosophical interpretation" of some concept, and I often find it rather vague (like the interpretation of the Yoneda lemma).
Maybe, can someone also recall a definition of optimization of something in probability/statistics, so that we can compare the two ideas? It could help in understanding how to transfer the idea of mean from probability/statistics to category theory.
And does optimization really have anything to do with means? It seems to me that aggregation and thus means have to do with tensor products which are not limits.
Jean-Baptiste Vienney said:
Sometimes mathematicians talk about some "philosophical interpretation" of some concept, and I often find it rather vague (like the interpretation of the Yoneda lemma).
You're right! Now that you mention it, I'm not sure anymore that "it is well-known that limits/co-limits are some kind of optimization problems". I think I have in mind the case of posets with least upper bounds / greatest lower bounds. But the narrative is certainly too vague.
This is probably not intelligent, but you could average objects like this in a compact closed category with coproducts (= biproducts)
( copies of )
Jean-Baptiste Vienney said:
And does optimization really have anything to do with means? It seems to me that aggregation and thus means have to do with tensor products which are not limits.
Regarding ways to interpret "averaging" in category theory, there is a very interesting line of work by Tom Leinster and Tom Avery on codensity monads.
I still have to digest the details, so I'm going to be very hand-wavy. The "slogan" I understood is that the elements of the codensity monad behave like integration operators. The usual integration operators are recovered when considering the inclusion of a certain sub-category of the category of measurable spaces, whose codensity monad is the Giry monad. When considering the inclusion $\mathbf{FinSet} \hookrightarrow \mathbf{Set}$, we get the ultrafilter monad, and the corresponding integration operator corresponds to "voting w.r.t. the ultrafilter".
But for other reasons, I am also interested in the median (because it's a basic example of "robust" statistics). Hence me wondering whether we could categorify the "median". I thought that there could be a way through the formulation of the median as an optimization problem. But clearly, I'm not sure it's the right way.
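On "voting w.r.t. the ultrafilter": in the finite case every ultrafilter is principal, so the vote is dictated by a single point. A small sketch of that slogan (names and data made up):

```python
from itertools import chain, combinations

X = [0, 1, 2]  # a finite set of voters

def principal_ultrafilter(x0):
    # On a finite set, every ultrafilter is principal: it consists of
    # exactly the subsets containing some fixed point x0.
    subsets = chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))
    return [set(s) for s in subsets if x0 in s]

def vote(U, ballots):
    # The elected candidate is the unique one whose set of supporters
    # belongs to the ultrafilter U.
    for cand in set(ballots.values()):
        if {x for x in X if ballots[x] == cand} in U:
            return cand

U = principal_ultrafilter(1)
print(vote(U, {0: "A", 1: "B", 2: "A"}))  # B: voter 1 is the dictator
```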
I don't understand much, but does it have anything to do with integrating with respect to a probability measure giving the probability of an event, and with using the coend/integral analogy?
I'm just asking naive questions, I don't know much about codensity monads and probability theory, but I like to think and learn about any stuff :sweat_smile:
Also, maybe weighted coends would be useful for averaging, if this notion makes sense.
Well, I will have trouble answering, because I don't know much about codensity monads either ^^' They are on my list of "things it would be good to understand at some point".
However, it's not the analogy between coend and integral, although there is a formula involving an end.
If you have a functor $F : A \rightarrow B$, then you can ask for the right Kan extension of $F$ along itself, which, if the extension exists, yields a functor $T^F : B \rightarrow B$. The universal property of the Kan extension makes $T^F$ a monad on $B$.
There is a formula using ends. For any object $b$ of $B$
$T^F(b) = \int_{a \in A} F(a)^{B(b, F(a))}$
where $F(a)^{B(b, F(a))}$ is a power.
Intuitively, an element of $T^F(b)$ is a machine that takes, for any $a$, an element of $B(b, F(a))$ and outputs an element of $F(a)$. In other words, an element of $T^F(b)$ acts like an integration operator for functions defined over $b$.
One instance of this machinery leads to $T^F$ being the Giry monad, i.e., the monad that associates to every measurable space $X$ the set of probability distributions over $X$. Any such probability distribution on $X$ defines an integration operator for measurable functions on $X$, i.e., the operator that outputs the expected value.
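The finite, discrete shadow of that intuition (toy data of my own): a probability distribution on a finite set is precisely a machine turning functions into numbers, namely their expected value:

```python
# A distribution p on a finite set X acts as an integration operator:
# it takes a function h : X -> R and returns its expected value.
X = ["rain", "sun", "snow"]
p = {"rain": 0.5, "sun": 0.3, "snow": 0.2}

def integrate(h):
    return sum(p[x] * h(x) for x in X)

temperature = {"rain": 10.0, "sun": 25.0, "snow": -2.0}
print(integrate(temperature.get))  # 0.5*10 + 0.3*25 + 0.2*(-2)
```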
I'm thinking about something completely different. I think a one-object category enriched over commutative monoids, such that every non-zero morphism is invertible and $n \cdot 1$ is always nonzero, is the same as a semi-field of characteristic $0$ (the multiplication is given by the composition).
And in a semi-field of characteristic $0$ you can average by the formula
$\frac{x_1 + \dots + x_n}{n} = (x_1 + \dots + x_n) \cdot (n \cdot 1)^{-1}$
which becomes here
$(f_1 + \dots + f_n) \cdot (n \cdot \mathrm{id})^{-1}$
for morphisms $f_1, \dots, f_n$ of the one-object category.
I don't think it will help you in any way :joy:
Categorifying this, you get that if you have parallel morphisms $f_1, \dots, f_n : A \rightarrow B$ in such a category, then you can define the mean as $(f_1 + \dots + f_n) \circ (n \cdot \mathrm{id}_A)^{-1}$ (when $n \cdot \mathrm{id}_A$ is invertible), but this is just what I was saying before, nothing amazing (something like this; there is probably some nonsense in what I've written).
I should not talk when I don't have anything interesting to say and don't understand half of what is written, sorry.
Jean-Baptiste Vienney said:
I should not talk when I don't have anything interesting to say and don't understand half of what is written, sorry.
Oh no, don't worry about that :)
Jean-Baptiste Vienney said:
By the way, why do you want to average objects? Isn't it more natural to average morphisms which can be interpreted as processes?
[Arrow category enters the chat]
Peva Blanchard said:
Well, I tried to look at some concrete examples, when we take only two objects, say $a$ and $b$.
With the enriching category $(\mathbf{Set}, \times)$. We are looking at the full sub-category of $\mathbf{Set}$ spanned by the
$C(a,z) \times C(b,z)$
for all $z$ in $C$. Now, if $C$ has a terminal object $t$, then
$C(a,t) \times C(b,t) \cong 1 \times 1 \cong 1$
which is terminal in $\mathbf{Set}$, and thus terminal in the sub-category as well. This implies that the "median" of $a$ and $b$ in $C$ is just the terminal object of $C$.
Well, in a category with a terminal object, the "distance" from any object to the terminal object is the monoidal unit and terminal object of the enriching category, aka "zero", so it's not surprising that it turns out this way when minimizing total distance... if you had the same situation in a metric space, the same would happen.
Jean-Baptiste Vienney said:
Jean-Baptiste Vienney said:
Probably even less useful: I'm saying this because it happened that one day I used a notion of "probabilistic permutation" (something that I called a "blender") in a symmetric monoidal $\mathbb{Q}_{\ge 0}$-linear category that I defined as a natural transformation $A^{\otimes n} \rightarrow A^{\otimes n}$ of the form $\frac{1}{|M|}\underset{\sigma \in M}{\sum}\sigma$ for some multiset $M$ of permutative natural transformations $A^{\otimes n} \rightarrow A^{\otimes n}$.
By the way, if ever it was useful, I found that you can write 12 different commutative diagrams involving the interaction of three different operations on blenders derived from tensor product, composition and enrichment. So I found this notion very cool.
This is very interesting. I would love to see a writeup of this.
Paolo Perrone said:
Now in enriched category theory we can replace the Cartesian product $\times$ with a more general monoidal structure $\otimes$. However, as far as I know, we cannot replace the coproduct by another, more general monoidal structure over which $\otimes$ distributes.
What if you formed a presheaf over the collection of products instead of a coproduct over them? Sheaves come equipped with stalks that can be used for convergence/ultrafilter kind of arguments -- if you have any measure of closeness or topology you could somehow filter down to the stalk which contains the single product most representative of the collection. This might have to do with the rate of stalk "narrowing" -- how many products it loses how fast... That's what we use averages and medians for anyway -- pick out a "representative member" of an enclosing set; here there may not be an enclosing thing, so we do our best with the members we have?
(Having equipped a site with a sheaf thus, you also usually get a sheaf of rings, using product and composition, I think... there is a forced sense in which the product could distribute over the coproduct by broadcasting arguments, though I have no idea if any of that gives you a desirable distributive property.)
I think all I have described reduces to taking a product of products, and then looking at various quotients of that which project down some products and lose others. Compositions of projectors then form a sequence that gives you the stalks on objects. The hope would be that sheaves might also give you more, but just looking at the "rate of change" of the cost over these sequences might be illuminating, and would require no extra machinery... it looks vaguely geometric-mean adjacent. I don't know how to actually access or manipulate them, however, as I know only the rudiments of cost categories.