I am trying to get a better grasp of the structures determined by multisorted equational theories, especially of when two different theories determine the same structure.
For example, given one type, T, one constant, c, and two unary functions, f and g, and no additional equations, terms of this theory would include: c, f(c), g(c), f(f(c)), f(g(c)), and so on.
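As a minimal Haskell sketch of this (the names Term, C, F, G are my own rendering of the type, c, f, and g), the closed terms of the theory are exactly the values of an inductive type with one nullary constructor and two unary ones:

```haskell
-- Terms of the theory: one constant c, two unary function symbols f and g.
data Term = C | F Term | G Term
  deriving (Eq, Show)

-- A few example terms: c, f(c), g(c), f(g(c)), g(f(f(c))).
examples :: [Term]
examples = [C, F C, G C, F (G C), G (F (F C))]
```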
It would be nice to have a general routine for studying the structure of any such theory (given the signature and some equations).
In this scenario, I seem to be treating function application identically to symbol concatenation. As c is a term and f is a function symbol, I know fc is a valid term. I am wondering if I should formalize very precisely what I mean by function application here, though.
If T was a set, and if I said T is a set such that c ∈ T and such that f : T → T, I believe this leaves open the possibility that f(c) = c, so that T = {c}. If we think about a function in the set-theoretic sense as a relation f ⊆ T × T that is left-total and right-unique, then I believe we can present a model of this theory as T = {c} ("the interpretation of the type symbol T is the set containing only the set which is the interpretation of the constant symbol c"), f = {(c, c)}, and g = {(c, c)}. Since f(c) = c ∈ T and g(c) = c ∈ T, this structure satisfies the theory.
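Continuing the Haskell sketch above (field names mc, mf, mg are mine): a model is any carrier together with interpretations of c, f, and g, and terms are interpreted by structural recursion. The one-element model makes f(c) = c, illustrating that nothing in the theory rules this out:

```haskell
-- A model: a carrier type a with interpretations of the symbols c, f, g.
data Model a = Model { mc :: a, mf :: a -> a, mg :: a -> a }

-- Interpret a term in a model by structural recursion.
eval :: Model a -> Term -> a
eval m C     = mc m
eval m (F t) = mf m (eval m t)
eval m (G t) = mg m (eval m t)

-- The one-element model T = {c}: here f(c) = c and g(c) = c, so every
-- term evaluates to the same element, yet the model satisfies the theory.
oneElement :: Model ()
oneElement = Model { mc = (), mf = id, mg = id }
```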
When one works with term algebras, it seems to be assumed that f(c) ≠ c, unless that equation is stated in the theory. I guess I feel like this isn't conveyed if we say that f(c) is "the function f applied to the element c", if we have not given more information about what other elements there may or may not be in T. We know that f(c) is a well-defined expression that refers to some element of T, but not which one.
I wonder if it is taken for granted in type theory that if T is a type, and c is a term of type T, and T → T is a function type, and f is a term of that function type, then f c is a term of type T, and there isn't really any question of "what f c equals"; it's just "f c"? Unless we introduce equality as a type, maybe, to state what f c equals?
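A loose Haskell illustration of this point (the class name is mine): rendering the signature as a type class with no laws, f (g c) type-checks as a term of type t purely by the application rule, and nothing anywhere says what it "equals":

```haskell
-- The signature as a class: any instance supplies c, f, g; no equations
-- are imposed (or even expressible) in the class itself.
class Theory t where
  c :: t
  f :: t -> t
  g :: t -> t

-- Well-typed by the typing rules alone; which element it denotes
-- depends entirely on the chosen instance.
someTerm :: Theory t => t
someTerm = f (g c)
```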
One simple way to think about associativity, I think, is that "nested expressions get flattened": (x * y) * z = x * (y * z) = x * y * z. If we make the operation implicit and denote applying it to x and y as xy, we see the "flattening": (xy)z = x(yz) = xyz.
I think I am observing that in the case of the first-mentioned theory (one type T, constant c, unary f and g), because the terms can be written out as strings (c, fc, gc, ffc, fgc, ...), it seems like there is "associativity" going on here; we don't need to designate an order of operations in the strings. But I think this is because these are unary operations.
If we associate f with 0, g with 1, and c with the empty string "", the terms of this theory become binary strings: "", 0, 1, 00, 01, 10, 11, and so on.
So I would like to say that this theory is isomorphic to the free monoid with generators 0 and 1 and identity element "", if I may.
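Continuing the Haskell sketch, here is that correspondence made concrete (the encoding choice f ↦ 0, g ↦ 1, c ↦ "" follows the message above): encode and decode are mutually inverse on binary strings, and substituting one term for the constant inside another corresponds to concatenating the strings, i.e. to the monoid operation:

```haskell
-- f becomes '0', g becomes '1', c becomes the empty string.
encode :: Term -> String
encode C     = ""
encode (F t) = '0' : encode t
encode (G t) = '1' : encode t

decode :: String -> Term
decode ""        = C
decode ('0' : s) = F (decode s)
decode ('1' : s) = G (decode s)
decode _         = error "not a binary string"

-- "Function application becomes the monoid operation": plugging a term u
-- in for the constant c inside t corresponds to string concatenation,
-- i.e. encode (plug t u) == encode t ++ encode u.
plug :: Term -> Term -> Term
plug C     u = u
plug (F t) u = F (plug t u)
plug (G t) u = G (plug t u)
```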
I find this interesting yet counter-intuitive, because what were functions in the first theory became elements in the second theory, and what was an implicit idea in the first theory, "function application", became an explicit operation in the second one, the monoid operation.
What is a nice mathematical way to express that this free algebraic theory of 2 unary functions and one constant turns out to be identical to the theory of a free monoid on 2 generators (I think)?
That the terms in this theory are lists is an artifact of having only unary function symbols. In general, terms will be trees. In the general case, fc is written f(c), and the meaning of this as function application is dictated by considering Set-valued functors (standard models). That f(c) != c in the term model is true by construction, not an assumption. (The initial/term model will satisfy theorems that most models don't, such as induction.)
Ok.
My goal is to classify CQL (Categorical Query Language) theories.
I guess I have a lot of reading and studying to do.
The nLab doesn't have an entry for CQL. I feel like it should. It would be a good way to gather all the essential info one needs to understand it.
Perhaps the paper "Algebraic Databases" is the best self-contained, formal, and complete explanation of CQL.
I may take some notes on the paper to digest it a bit further.
There have been various iterations of categorical database theory. Previously, instances of schemas were defined as functors from a category into the category of sets. Intuitively, one can imagine that if a schema includes a type T, the instance-functor I associates with T a set I(T) of elements which are understood to be of type T.
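A toy Haskell rendering of this idea (all names mine, and only the object part of the functor): the instance assigns to each type in the schema a set of elements of that type; the morphism part, not shown, would assign functions between these sets.

```haskell
import qualified Data.Map as Map

-- Object part of a Set-valued instance: each schema type gets a set,
-- modeled here as a list of distinct values.
type TypeName = String
type Instance = Map.Map TypeName [String]

-- A hypothetical two-type schema and one instance of it.
toyInstance :: Instance
toyInstance = Map.fromList
  [ ("Employee", ["alice", "bob"])
  , ("Dept",     ["cs", "math"])
  ]
```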
A newer approach uses Lawvere theories.
The concepts of categorical database theory, such as schemas, instances, schema mappings, and queries, can be organized using proarrow equipments, a.k.a. framed bicategories.
Another approach to categorical database theory, by Robert Rosebrugh and collaborators, takes schemas as sketches. A sketch is a formalization of the concept of a theory; it generalizes the notion of a Lawvere theory. A sketch is a small category equipped with a set of cones and a set of cocones. (A small category is a category with a small set of objects and a small set of morphisms. "Small set" has various definitions depending on the formalization used: it can mean a set rather than a proper class, or, using Grothendieck universes, a set within a fixed universe U.) I will certainly be taking a detour into studying some examples of sketches. If S is a sketch, a model of S in a category C is a functor from S to C taking each cone in S to a limit cone in C, and each cocone in S to a colimit cocone in C.
An alternative definition of a sketch is a directed graph G together with a set of diagrams D, a set of cones L, and a set of cocones K. Example: consider the directed graph with two vertices, 1 and X, and one edge e : 1 → X. Let D be empty. Because there are no diagrams, any selection of a vertex as the apex of an empty cone is valid (I think), since the apex is not required to have any morphisms to a base. Let L contain just the empty cone with apex 1. Let K be empty. A model of this sketch is a functor M which sends the cone to a limit cone. So M(1) must be the terminal object (I believe because a limit of the empty diagram is exactly a terminal object), and if this is a functor into Set, M(X) can be any set; the edge e is sent to a function M(1) → M(X), which picks out an element of M(X). This means models of this sketch are pointed sets (sets with a distinguished element).
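A loose Haskell rendering of this example (all names mine): the sketch data is tiny, and a Set-model of it amounts to a pointed set, since the empty cone forces M(1) to be a one-element set and the edge then selects an element of M(X).

```haskell
-- The sketch: vertices 1 and X, one edge e : 1 -> X, no diagrams,
-- one empty cone with apex One, no cocones.
data Vertex = One | VX deriving (Eq, Show)
data Edge   = E   deriving (Eq, Show)  -- e : One -> VX

-- A Set-model: M(One) is forced to be a singleton (limit of the empty
-- diagram), so all that remains is the carrier M(VX) together with the
-- element that M(e) picks out -- i.e., a pointed set.
data PointedSet a = PointedSet { carrier :: [a], basepoint :: a }
  deriving Show
```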
This is enough to digest. I will explore sketches and their models before continuing.
The reason we had to layer categories on top of algebraic theories to do data migration/integration is the same reason database migration/integration theory isn't just finite model theory: you need to distinguish two kinds of value, those coming from the input data and those generated by the migration/integration operations. This is unlike query, where the relational algebra cannot express queries that have values in their output not contained in their input. You end up with "unknowns"/"labelled nulls" when integrating/migrating data with colimits/lifting problems that you don't get when you just query (limits and coproducts), and to account for this semantics you have to enrich the original Spivak model, the original Rosebrugh model, etc. There are a few approaches to this "attribute problem" (our terminology from 'Algebraic Data Integration'). BTW, Rosebrugh's EASIK tool is available inside of CQL; you can push a button to translate CQL to EASIK and vice versa.
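A minimal sketch of that "two kinds of value" distinction (names and example are mine, not CQL's actual representation): values in a migrated/integrated instance are either constants from the input or freshly generated labelled nulls; queries built from limits never need the second kind, while colimit-style merges do.

```haskell
-- Values in an integrated/migrated instance come in two kinds.
data Value
  = Constant String   -- came from the input data
  | LabelledNull Int  -- generated by the migration; only its label is known
  deriving (Eq, Show)

-- E.g. merging two tables where a required foreign key has no matching
-- target row forces us to invent targets: labelled nulls.
mergedColumn :: [Value]
mergedColumn = [Constant "alice", LabelledNull 0, LabelledNull 1]
```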