You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hello, friends. Recently, I've read an article from Spivak (2015) where he talked about how nested data structures had not yet been formalized in category theory. I'm quite interested in this topic, I was wondering if anyone is aware of advances in such front?
Not me. Maybe someone else here knows. If not, it would be good to email David Spivak and let us know his answer!
As in "nested inductive types"?
David showed how a schema can be thought of as a category, and thus used it to rigorously define things like data migration. I was wondering if there was something similar, but with nested data structures.
Are you aware of datatypes represented as polynomial functors/containers?
https://www.cs.nott.ac.uk/~psztxa/publ/fossacs03.pdf
http://www.contrib.andrew.cmu.edu/~avigad/Papers/qpf.pdf
Thanks for the Avigad et al paper @Eric Bond that I didn't know about. but 2019 is a bit old too, what happened since? do you know?did they complete their Lean library? has it been ported to agda?
I’m not sure what has happened since. Im sure a fair amount of their constructions have been replicated in Agda. What would be interesting is a reexamination of this paper through the lense of HoTT where we can use higher inductive types or quotient inductive inductive types instead of Lean’s notion of quotient.
Didn't know about it, @Eric Bond . Thanks for the references.
I once asked in Strathclyde why containers are not used in proof assistants. If I remember correctly, the answer was that you need functional extensionality to prove the container representation theorem (aka that )
Another reason containers are not used in proof assistants is that the direct implementation isn't very efficient. Using a function to represent a small tuple is usually going to be worse, and involve more unnecessary recomputation, than just storing the values.
Thanks @Bob Atkey and @Jules Hedges , this makes sense. but I wish @Conor McBride would comment, as when this zulip started he seemed to have high hopes for the implementation of polynomials, if I understood correctly what he was saying then. (always a big if...)
There's combinatorial species which is very similar to datataypes-as-functors.Maybe this may be of use since from the abstract it tries to reconcile ADTs and species through HoTT? Don't know about nesting though. Is there any reason it couldn't just be regular composition of functors?
https://www.cis.upenn.edu/~sweirich/papers/yorgey-thesis.pdf
The idea is the formalize the interplay between the schema category defined by @David Spivak , and a representation of the data as a nested collection. David formalized the notion of a database schema, where one defines the schema as a category S
and the instance of a database is a functor (everyone here probably knows this better than me). So what I'm seeking is a way to represent the nested collection in a similar way, such that one can translate an instance from a relational db to a nested collection.
I'm still very new to category theory (started studying this year actually), so it's very likely that the solution might not at all complicated, and I'm just unable to grasp it. Or it might be quite complicated, and I just have no idea what I'm getting into.
You can do category theory in category theory so I get where you're coming from - I'm self taught and CT is less of a dense forest and more like fractal matroshka doll where you can model one thing in various equally abstract ways which turn out to be equivalent for some notion of equivalence.It's the most useful empty bowl in math.
Hi @Davi Sales Barreira: Nesting (also related to aggregation) turns out to be surprisingly tricky in this context. Spivak has made some recent progress on the problem using polynomial functors, but I wouldn't say the issue is resolved yet,
To understand what's going on, lets build a schema with one object (entity) and one arrow .
In order to "disaggregate" the subsets, we could extend the schema with a [[bundle]] over . If you're not familiar with this idea, we can represent a families of sets indexed by a function ; each element corresponds to the set of 's that map to it (the [[fiber]]).
This forgets about the "identity" of the elements in the list, but we can put that back in by adding an attribute . So the new schema will look like this
Using bundles and spans to replace nested data structures is a relatively routine move in database engineering.
The problem of aggregation is basically the reverse: if we have a bundle in our schema, how can we calculate aggregate operation like a sum over the fibers. The simplest case is just counting how many elements are in a fiber (or even whether there are any).
"Aggregate attributes" like this don't behave in the same way as ordinary attributes; in particular, they are not preserved by natural transformations. You can check this paper for a discussion of some of the issues involved.
thanks a lot @Spencer Breiner ! Some of the concepts are new to me, and I'm still trying to figure out the machinery necessary. You gave me a lot to think about. Indeed, I'm aware of the Functorial Aggregation paper by Spivak. It's in my reading list ;D I just didn't start yet because I need more solid foundation in CT.
I've actually set a message to David Spivak asking whether the nested structure had already been figured out, since it was related to the aggregation. He told me that it was still not resolved.
The Functorial Aggregation paper is pretty hard. I'm still digesting it myself. The scheduling paper I pointed to is intended to be accessible, so that's probably a good place to start.
Do you have an example or use case that you want to apply? I find it helps to have a concrete example in front of me when I want to understand an abstract concept.
A good exercise is to think about the difference between representing a subset in a schema as an arrow versus a truth function . The instances are the same, but the morphisms are different.
Whoops! We actually need a limit sketch/schema (rather than just a plain category) to get the instances to agree.
See Analytic Combinatorics - Algorithms Project pg.124
Example using combinatorial species -
Set Partitions: F=set(set(Z, card≥1)) while the recursive version is Total Partitions: H=Z+set(H, card≥2).
Spencer Breiner said:
Do you have an example or use case that you want to apply? I find it helps to have a concrete example in front of me when I want to understand an abstract concept.
Yeah, I do.
Daniel Geisler said:
See Analytic Combinatorics - Algorithms Project pg.124
Example using combinatorial species -
Set Partitions: F=set(set(Z, card≥1)) while the recursive version is Total Partitions: H=Z+set(H, card≥2).
Thanks, I'll take a look.
@Spencer Breiner , I'll take a look at the article you posted. At the moment, I've mostly studied CT via more applied books (e.g. Seven Sketches; Category Theory for the Sciences). I'm actually finishing Category Theory for the Sciences, do you recommend any follow up book? Many of the terms you've used I haven't actually encountered in these books.
I'm interested in the interplay between a relational db and a nested data structure. I want to formalize how to take RDB instances to a nested format.