Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: learning: reading & references

Topic: Datasets with operadic structure


view this post on Zulip Eric Dolores (Dec 27 2025 at 02:04):

Hello, my team and I are finishing a paper that describes an algorithm, and we are looking for datasets to test it on. I hope this counts as asking for references; if not, I can move the question somewhere else.

The context is posets where every node has a value, and the values are supposed to satisfy the inequalities of the poset. You can think of graphic monoids/commitment networks as in complex-adaptive-systems-part-7 (perhaps PERT diagrams too?), where every time you make a commitment you store some numerical information.

Now, in real life, measurements of experimental results have errors (human errors, or maybe the internet connection dropped while the algorithm was running because everyone uses the same cloud service, etc.). GPAV and its descendants address this: given a poset $P$ and a set of values $Y$, one value per node of the poset, GPAV returns a new $Y'$ that satisfies the inequalities of the poset and is not too far from $Y$, in a precise sense.
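As a minimal sketch of the kind of correction involved (not the GPAV algorithm itself), here is the special case where the poset is a chain, using scikit-learn's IsotonicRegression; the random data and setup are illustrative only:

```python
# Sketch: when the poset is the chain 0 < 1 < ... < n-1, "satisfy the
# inequalities of the poset, stay close to Y" is exactly isotonic regression.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 10
f = np.sort(rng.uniform(0, 1, n))        # true non-decreasing values f(q)
y = f + rng.normal(0, 0.05, n)           # noisy measurements y_q = f(q) + r_q

iso = IsotonicRegression()               # weak order preserving = non-decreasing
y_prime = iso.fit_transform(np.arange(n), y)

print("violations before:", np.sum(np.diff(y) < 0))
print("violations after: ", np.sum(np.diff(y_prime) < 0))   # always 0
```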

We adapted GPAV to the operadic composition of posets (lexicographic sum) and obtained improvements by using the algebraic information. That is, if you know a priori that $P = Q(R_1, \dots, R_n)$, where $Q, \{R_i\}_i$ are posets and $P$ is their operadic composition, then instead of applying GPAV to $P$ we apply it to the $R_i$'s and then to the output after taking the lexicographic sum with $Q$. We use the information "$P$ can be factored" to optimize our algorithm (reducing the number of operations), making it different from (and faster than) the segmentation-based GPAV (for those who know about it).
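To fix notation, here is a minimal sketch of the lexicographic sum construction itself on toy posets; the `lex_sum` helper and the representation of a poset as a pair (elements, leq) are illustrative, not from the paper:

```python
# Lexicographic sum Q(R_1, ..., R_n): elements are pairs (q, r) with r in R_q,
# and (q, r) <= (q', r')  iff  q < q' in Q, or q == q' and r <= r' in R_q.

def lex_sum(Q, Rs):
    """Lexicographic sum of the posets Rs over the poset Q.

    Q  = (q_elements, q_leq)
    Rs = dict mapping each q in q_elements to (r_elements, r_leq)
    """
    q_elems, q_leq = Q
    elems = [(q, r) for q in q_elems for r in Rs[q][0]]

    def leq(a, b):
        (qa, ra), (qb, rb) = a, b
        if qa == qb:
            return Rs[qa][1](ra, rb)   # compare inside the same block R_qa
        return q_leq(qa, qb)           # strictly below/above in Q
    return elems, leq

# Example: Q is the chain "a" < "b", and R_a, R_b are 2-element chains.
chain2 = ([0, 1], lambda x, y: x <= y)
Q = (["a", "b"], lambda x, y: x <= y)
P_elems, P_leq = lex_sum(Q, {"a": chain2, "b": chain2})
print(P_elems)                       # [('a', 0), ('a', 1), ('b', 0), ('b', 1)]
print(P_leq(("a", 1), ("b", 0)))     # True: everything in R_a sits below R_b
```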

The algorithm is ready, but now we are struggling to find datasets. Raw datasets with the right information may have been preprocessed, with the operadic information discarded (that is, people publish $P$ even if they originally had $Q, \{R_i\}_i$), or people are just not thinking this way when they record their data.

So, if anyone can suggest a dataset for us to run experiments on, we would be very thankful.

We will make the paper and code available as soon as we finish these experiments.

view this post on Zulip Chad Nester (Dec 30 2025 at 11:07):

The context is posets where every node has a value, and the values are supposed to satisfy the inequalities of the poset.

Can you make this more precise? Do you mean a poset homomorphism/monotone function?

view this post on Zulip Eric Dolores (Dec 30 2025 at 13:31):

Yes: given a finite poset $P$ and a (weak) order-preserving map $f$ from $P$ to $\mathbb{R}$, consider $W = \{f(q) \mid q \in P\}$.
Now, we study the case where we don't know the map $f$, and when we measure $W$ there is noise. So for every $q \in P$ we measure $y_q = f(q) + r_q$, where $r_q$ is unknown but small.

The weak order-preserving condition means non-decreasing. This is part of isotonic regression.
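As a minimal sketch of what "the values satisfy the inequalities of the poset" means for a general finite poset, here is an illustrative `count_violations` helper (hypothetical, not from the paper) that counts relations $q \le q'$ violated by noisy values:

```python
# Count how many order relations q <= p are violated by values y,
# i.e. where y_q > y_p despite q <= p in the poset.
def count_violations(elems, leq, y):
    return sum(
        1
        for q in elems
        for p in elems
        if q != p and leq(q, p) and y[q] > y[p]
    )

# Example on the 2x2 "diamond" {(0,0) < (0,1), (0,0) < (1,0), (0,1) < (1,1), (1,0) < (1,1)},
# ordered componentwise.
elems = [(0, 0), (0, 1), (1, 0), (1, 1)]
leq = lambda a, b: a[0] <= b[0] and a[1] <= b[1]
y_noisy = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.05, (1, 1): 0.9}  # (1,0) dips below (0,0)
print(count_violations(elems, leq, y_noisy))   # 1 violated relation
```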