An incremental, modular approach to coherence theorems · theory: category theory

Consider the following two formal "theories", where a theory is essentially a type and kind context. Our logic has the restriction that all types, type constructors and function symbols live to the left of the context separator

\mid

; everything to the right of

\mid

is purely first-order, i.e., there are no function types, no partial function applications, etc.

There are two obvious context morphisms

\Xi_F\to \Xi_C

allowing the theory of a functor to interpret the theory of a category; intuitively this is somehow dual to the idea that the "walking category" should embed into the "walking functor" in two different ways.

Both of these kind context morphisms can be extended to first-order type context morphisms in a natural way:

Both of these context morphisms respect equality, in the sense that the equational part of the theory of a functor extends the equational
theory of a category: if two morphisms definable over

\Xi_C, \Gamma_C

are provably equal, then they are provably equal when reindexed along

(\Phi_C, \Psi_C)

, and provably equal when reindexed along

(\Phi_D,\Psi_D)

as well. Note that this would be true without any interesting coherence conditions on

F

, for example if

F

were only a graph homomorphism, this would still be true.

A more interesting theorem, and one that relies on the coherence conditions of a functor, asserts that for any two objects

c,c'

definable over

(\Xi_C,\Gamma_C)

(so

c,c'

can only be

x,y

), the reindexing maps

(\Phi_C,\Psi_C)^\ast

and

(\Phi_D,\Psi_D)^\ast

each induce a bijection between terms of

Hom(c,c')

and terms of

Hom(\Psi^\ast_C c, \Phi^\ast_C c')

(respectively, terms of

Hom(\Psi^\ast_D c, \Phi^\ast_D c')

) up to provable equality. In all cases the sets

Hom(c,c')

contain either zero or one term up to provable equality, and the reindexing functors establish a bijection in all cases.

Therefore, I suggest that an interesting guiding principle when studying coherence constructions in category theory is that when large theories are built out of smaller theories, we should associate to these smaller theories certain families of types over given contexts, and ask that the projection morphisms from the large theories to the smaller theories establish bijections between these sets of terms up to provable equality. Intuitively, the definition of a functor respects the unique definability constraints of the definition of a category, as well as adding its own new unique definability constraints. This can be used as a guide when designing a formal library for category theory: consistently prove as you go that sub-theory projections induce isomorphisms between sets of terms up to provable equality; then, when proving an equality between terms in a complex theory, manually reduce the problem to proving equality between terms in a sub-theory, and appeal to a known characterization of equality of terms in the sub-theory.

Patrick Nicodemus (Sep 07 2025 at 18:25):

More lengthy/formal discussion
A big and vague problem is how to make it easier to prove coherence theorems in category theory. We want to make it more feasible to prove complex theorems with pen and paper, and to formalize theorems in a theorem prover.

Sometimes I think about this problem in terms of general proof search, but undirected proof search guided by heuristic methods feels theoretically unsatisfying. In addition, from a software engineering perspective it is unstable: what if you change the heuristics of your proof search and can no longer prove the same theorems you could before? For this reason I think a lot about how to establish decidability of equations in a theory, rather than searching for a better semi-decision procedure for equality. We might hope to construct such decision procedures automatically, for example by a Knuth-Bendix completion technique.

Yesterday I had an idea which I think is mathematically interesting and maybe practically feasible to carry out in a theorem prover.
I restrict my attention to generalized algebraic theories, which model a large and interesting part of category theory. GAT's I might want to consider are:

A GAT is similar to a kind context in type theory: it specifies types, dependent types and functions between these types. A morphism of GATs can be defined similarly to a morphism of contexts in type theory, which gives interpretations. I can be precise about this, but it is somewhat complicated. We should have, for example, two interpretation morphisms

\textbf{Functor}\to \textbf{Cat}

which correspond to the fact that

\textbf{Cat}

occurs as a sub-theory of

\textbf{Functor}

in two different ways. Similarly there are two distinguished morphisms

\textbf{NatTrans}\to \textbf{Functor}

We will not regard equality as fundamental here; it will just be a distinguished binary relation which is symmetric, transitive and reflexive, and all other functions involved are homomorphisms with respect to it.

Associated to any GAT there is a category of type contexts over the GAT, which comes equipped with a presheaf of terms over those contexts. So, associated to a GAT there is a (syntactic) category with families, which describes the types and terms we can define over a given context. This extends to a contravariant functor GAT^op -> CWF. These CWFs do not have any kind of

\lambda

-abstraction or product / coproduct formation rules, they are entirely first-order; function symbols are only permitted to live in the GAT itself, not the type context.

I conjecture that given a GAT

\Xi

and a term context

\Gamma

, the types and terms over

\Xi,\Gamma

are presentable by a "first-order indexed inductive type", which is an inductive type with no parameters whose arity recursively consists of first-order indexed inductive types. Below I sketch what I mean by this: given

\Xi

= the theory of a category, and

\Gamma = w,x,y,z: Ob, f: w\to x, g : x\to y, h: y\to z

, I can present the types and terms as such:

Set Implicit Arguments.
Inductive Type0 := Ob : Type0.
Inductive tm0 : Type0 -> Set :=
| w : tm0 Ob
| x : tm0 Ob
| y : tm0 Ob
| z : tm0 Ob.
Inductive Type1 : forall (O : Type0), tm0 O -> tm0 O -> Set := Arr : forall (x y : tm0 Ob), Type1 x y.
Inductive tm1 : forall (O : Type0) (x y: tm0 O), Type1 x y -> Set :=
| f : tm1 (Arr w x)
| g : tm1 (Arr x y)
| h : tm1 (Arr y z)
| id : forall (x : tm0 Ob), tm1 (Arr x x)
| comp : forall (x y z : tm0 Ob), tm1 (Arr x y) -> tm1 (Arr y z) -> tm1 (Arr x z).

A solution to the coherence problem should be some explicit characterization of the quotient

A_i/R_i

for all

i

. For example, returning a list of

n_i

elements of

A_i

together with a proof that every element of

A_i

relates to exactly one of these should be a solution. If it is necessary to generalize to the case where

A_i/R_i

may be infinite, we could perhaps say that an "explicit characterization" is given by an isomorphism between

A_i/R_i

and a type family explicitly definable in terms of

i
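As a rough illustration of the finite case, the "list of representatives plus a proof" idea could be packaged as a record. This is only a sketch under stated assumptions: the names `finite_solution`, `reps`, `covers` and `unique_rep` are hypothetical and not taken from any library, and the toy instance uses `bool` with Leibniz equality as a stand-in for a real `A_i` and `R_i`.

```coq
Require Import List.

(* Hypothetical packaging of a finite solution for one pair (A, R):
   a list of representatives such that every element is related to
   some representative, and to at most one of them. *)
Record finite_solution (A : Type) (R : A -> A -> Prop) : Type := {
  reps : list A;
  covers : forall a : A, exists r, In r reps /\ R a r;
  unique_rep : forall a r r' : A,
    In r reps -> In r' reps -> R a r -> R a r' -> r = r'
}.

(* Toy instance (assumption: A = bool, R = Leibniz equality). *)
Definition bool_solution : finite_solution bool (@eq bool).
Proof.
  refine (Build_finite_solution bool (@eq bool) (true :: false :: nil) _ _).
  - (* every boolean is one of the listed representatives *)
    intros a. exists a.
    split; [destruct a; simpl; auto | reflexivity].
  - (* two representatives equal to the same element coincide *)
    intros a r r' _ _ H H'. congruence.
Defined.
```

In an actual coherence problem the relation `R` would be the inductively generated provable equality on terms rather than `eq`, so both proof obligations would go by induction on terms and on equality derivations.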

Now, these are all plausible-sounding definitions, but the only insight they give into coherence problems so far is that, when talking about "decision procedures for equality of morphisms", it is probably more elegant and workable to present the problem using inductive types rather than in the more general setting of an arbitrary Turing machine. Here is the main theoretical insight I have: many interesting and complex coherence problems

(\Xi, \Gamma, \{A_i\}_{i\in I}, \{R_i\}_{i\in I})

are equipped with projections onto simpler problems

(\Xi', \Gamma', \{B_j\}_{j\in J}, \{ S_j \}_{j\in J})

in the sense of having a morphism of GAT's

\Phi : \Xi\to\Xi'

, a context morphism

\Psi : \Gamma\to\Phi^\ast(\Gamma')

, and

f : J\to I

such that

A_{f(j)} = (\Phi,\Psi)^\ast B_j

for all

j

, and

S_j \rightarrow R_{f(j)}

for all

j

. It is then immediate that

(\Phi,\Psi)

induces contravariantly a family of maps

B_j/S_j \to A_{f(j)}/R_{f(j)}

for all

j

. For example, if we consider the theory

\Xi

of a functor

F : C\to D

together with context

\Gamma = x, y : C, f : x\to y

, then if we regard the category

D

as an embedded sub-theory of

(C,D,F)

and regard the context

\Gamma' = d_0, d_1 :D, g : d_0\to d_1

as interpreted by

d_0 = F(x), d_1 = F(y), g = F(f)

, then we can prove the equation

F(f) \circ id_{F(y)} = F(f)

using only the category laws for

D

, and this equation is still valid in the functor theory

(C, D, F)

; the interpretation map interprets the equality proofs in particular.
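The example above can be sketched in Coq as follows. This is a minimal sketch, not a definitive encoding: all names (`tmD`, `tmF`, `iob`, `interp`, `interp_resp`, and so on) are hypothetical, composition is written in diagrammatic order, and the relation `eqF` lists only the category laws for D inside the functor theory. The functor laws and the equality on C-terms are deliberately omitted, since the preservation proof does not use them.

```coq
(* Small theory: a category D in context d0 d1 : Ob, g : d0 -> d1. *)
Inductive obD : Set := d0 | d1.
Inductive tmD : obD -> obD -> Set :=
| g : tmD d0 d1
| idD : forall a, tmD a a
| compD : forall a b c, tmD a b -> tmD b c -> tmD a c.

(* Provable equality generated by the category laws. *)
Inductive eqD : forall a b, tmD a b -> tmD a b -> Prop :=
| D_refl : forall a b (t : tmD a b), eqD a b t t
| D_sym : forall a b (s t : tmD a b), eqD a b s t -> eqD a b t s
| D_trans : forall a b (s t u : tmD a b),
    eqD a b s t -> eqD a b t u -> eqD a b s u
| D_idl : forall a b (t : tmD a b), eqD a b (compD a a b (idD a) t) t
| D_idr : forall a b (t : tmD a b), eqD a b (compD a b b t (idD b)) t
| D_assoc : forall a b c d (s : tmD a b) (t : tmD b c) (u : tmD c d),
    eqD a d (compD a b d s (compD b c d t u)) (compD a c d (compD a b c s t) u)
| D_cong : forall a b c (s s' : tmD a b) (t t' : tmD b c),
    eqD a b s s' -> eqD b c t t' -> eqD a c (compD a b c s t) (compD a b c s' t').

(* Large theory: F : C -> D in context x y : C, f : x -> y. *)
Inductive obC : Set := x | y.
Inductive tmC : obC -> obC -> Set :=
| f : tmC x y
| idC : forall a, tmC a a
| compC : forall a b c, tmC a b -> tmC b c -> tmC a c.
Inductive obF : Set := Fo : obC -> obF.          (* objects F(a) of D *)
Inductive tmF : obF -> obF -> Set :=
| Fm : forall a b, tmC a b -> tmF (Fo a) (Fo b)  (* F on morphisms *)
| idF : forall a, tmF a a
| compF : forall a b c, tmF a b -> tmF b c -> tmF a c.

(* Only the D category laws; functor laws omitted in this sketch. *)
Inductive eqF : forall a b, tmF a b -> tmF a b -> Prop :=
| F_refl : forall a b (t : tmF a b), eqF a b t t
| F_sym : forall a b (s t : tmF a b), eqF a b s t -> eqF a b t s
| F_trans : forall a b (s t u : tmF a b),
    eqF a b s t -> eqF a b t u -> eqF a b s u
| F_idl : forall a b (t : tmF a b), eqF a b (compF a a b (idF a) t) t
| F_idr : forall a b (t : tmF a b), eqF a b (compF a b b t (idF b)) t
| F_assoc : forall a b c d (s : tmF a b) (t : tmF b c) (u : tmF c d),
    eqF a d (compF a b d s (compF b c d t u)) (compF a c d (compF a b c s t) u)
| F_cong : forall a b c (s s' : tmF a b) (t t' : tmF b c),
    eqF a b s s' -> eqF b c t t' -> eqF a c (compF a b c s t) (compF a b c s' t').

(* Interpretation: d0 := F x, d1 := F y, g := F f. *)
Definition iob (a : obD) : obF := match a with d0 => Fo x | d1 => Fo y end.
Fixpoint interp (a b : obD) (t : tmD a b) {struct t} : tmF (iob a) (iob b) :=
  match t in tmD a' b' return tmF (iob a') (iob b') with
  | g => Fm x y f
  | idD o => idF (iob o)
  | compD o1 o2 o3 s u =>
      compF (iob o1) (iob o2) (iob o3) (interp o1 o2 s) (interp o2 o3 u)
  end.

(* The interpretation carries equality proofs to equality proofs. *)
Lemma interp_resp : forall a b (s t : tmD a b),
  eqD a b s t -> eqF (iob a) (iob b) (interp a b s) (interp a b t).
Proof.
  intros a b s t H; induction H; simpl.
  - apply F_refl.
  - apply F_sym; assumption.
  - eapply F_trans; eassumption.
  - apply F_idl.
  - apply F_idr.
  - apply F_assoc.
  - apply F_cong; assumption.
Qed.

(* The equation from the text, F f composed with id_{F y}, obtained by reindexing. *)
Example Ff_idr :
  eqF (Fo x) (Fo y) (compF (Fo x) (Fo y) (Fo y) (Fm x y f) (idF (Fo y))) (Fm x y f).
Proof. exact (interp_resp d0 d1 (compD d0 d1 d1 g (idD d1)) g (D_idr d0 d1 g)). Qed.
```

The lemma `interp_resp` is what makes the induced map on quotients well defined; it says nothing yet about that map being injective or surjective.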

What is really interesting is that often the induced functions

B_j/S_j \to A_{f(j)}/R_{f(j)}

are bijections: even though the super-theory adds new terms, it also adds new equations constraining the terms. For example, in the situation above, the only definable morphism from

d_0\to d_1

in the context of

D

is just

g

; all other definable morphisms are provably equal to it (here we are using the formalism of GATs to give us a precise meaning of "definable").

I've rambled a bit, but one practical takeaway here is supposed to be: when designing a formal library for category theory, aggressively prove many little decision procedures for type inhabitation and morphism equality over a GAT context, proving their correctness by induction on the set of all terms of the GAT. The key idea is that once you write a lot of these it should start to get easier, because most of the cases of the induction should reduce to cases that are already known for a sub-theory.
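To make "little decision procedure" concrete, here is a minimal sketch for type inhabitation in the theory of a category over the context x, y : Ob, f : x -> y. The names `inhabited`, `tm_inhabited` and `inhabited_complete` are hypothetical, and the sketch covers only inhabitation of hom-types, not equality of morphisms.

```coq
(* Terms of the theory of a category over x y : Ob, f : x -> y. *)
Inductive ob : Set := x | y.
Inductive tm : ob -> ob -> Set :=
| f : tm x y
| idt : forall a, tm a a
| comp : forall a b c, tm a b -> tm b c -> tm a c.  (* diagrammatic order *)

(* Candidate decision procedure: is Hom(a, b) inhabited? *)
Definition inhabited (a b : ob) : bool :=
  match a, b with
  | y, x => false
  | _, _ => true
  end.

(* Soundness, by induction on terms: a term of type Hom(a, b)
   exists only when the procedure answers true. *)
Lemma tm_inhabited : forall a b, tm a b -> inhabited a b = true.
Proof.
  intros a b u; induction u as [ | o | o1 o2 o3 s IHs t IHt ].
  - reflexivity.
  - destruct o; reflexivity.
  - destruct o1, o2, o3; simpl in *; congruence.
Qed.

(* Completeness: whenever the answer is true, exhibit a term. *)
Lemma inhabited_complete : forall a b, inhabited a b = true -> tm a b.
Proof.
  intros a b H; destruct a, b; simpl in H.
  - exact (idt x).
  - exact f.
  - discriminate H.
  - exact (idt y).
Qed.
```

In particular Hom(y, x) has no terms. Showing that each inhabited hom-type has exactly one term up to provable equality would need a further induction over an equality relation, which this sketch does not attempt.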

Morgan Rogers (he/him) (Sep 09 2025 at 16:38):

@Jonas Frey might be interested in this, but I'm not sure what kind of response you're after @Patrick Nicodemus ?

Patrick Nicodemus (Sep 09 2025 at 18:04):

Links to literature that seems like it might be related, and comment on whether the ideas resonate or fail to resonate.

I would be interested if anybody has thought about decision procedures for equality or type inhabitation in generalized algebraic theories, given my observation that a lot of category theory can be formulated in GATs (and moreover in a sort of acyclic fragment of GATs, as we tend not to have functions from, say, arrows to objects). I am also interested in restrictions on GATs that make them amenable to study, taking the form of contractibility constraints.

Then, regarding the second point, I am interested in whether others have commented on the concept I am talking about, namely the idea that when one categorical theory S occurs as a sub-theory of another theory T, we often have a situation where every newly definable term t in T which lives in a type sigma already present in S is in fact equal to some s in sigma which is already definable in S. Moreover, s and s' are provably equal in S iff they are provably equal in T. Together we might say metaphorically that the embedding of S into T is fully faithful. Does this strike anyone else as curious? I tried above to make my ideas rigorous, but at a non-rigorous level it is a bit like, say, how manifolds are colimits of open subspace embeddings and not of arbitrary maps. We know that contractibility is an important idea when talking about coherence conditions; for example, we prove that the operad for monoidal categories is contractible. What I'm asking is, has anyone looked into this idea that nice, good, coherent categorical structures subject to coherence axioms (like functors, natural transformations and Grothendieck fibrations) tend to be somehow glued out of subtheories in a way that preserves unique definability in the embedded subtheories?

I'll give another example just because I'm not sure the idea is coming across. Let C, D be categories, let F : D -> C be a functor. Let c be an object in C and d an object in D. Let f: c->Fd. Then the only definable morphism c->Fd is f, and all the formalism above is just meant to give precise meaning to "only definable morphism".

Now extend the theory by introducing a universal arrow eta (d', c -> Fd'). It's still the case that f is the only definable morphism from c to Fd up to equality. That hasn't changed. The extension of the theory by a "good, coherent" concept (universal arrow) has preserved the number of definable terms c -> Fd. I just find that interesting.

Patrick Nicodemus (Sep 09 2025 at 18:11):

Actually this is true for all homsets definable in the original diagram I think.

Ryan Wisnesky (Sep 09 2025 at 21:44):

At categoricaldata.net we know a lot about decision procedures for algebraic theories and a lot about model completion procedures ("chase algorithms") for generalized algebraic theories (given a non-model of a GAT, repair the non model to be a model in a universal way).

Patrick Nicodemus (Sep 10 2025 at 01:37):

Thank you Ryan. I appreciate that. I skimmed through the "left Kan extensions by chase" paper and it looks interesting but it will take me some time to digest it. Do you have examples of decision procedures you can point to so I can get an idea of this?

Patrick Nicodemus (Sep 10 2025 at 01:38):

Ryan Wisnesky (Sep 10 2025 at 04:54):

Nathan Corbyn (Sep 10 2025 at 08:27):

I think you’re more or less proposing to lift these ideas (in the special case of normalisation) to the generalised algebraic setting

Nathan Corbyn (Sep 10 2025 at 08:28):

I’ve thought a lot about this and, for normalisation, this works almost identically but, for partial evaluation, the story becomes quite complicated