You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hi all!
I like the "Polynomial Functors: A Mathematical Theory of Interaction" book and related papers quite a bit. My day job is writing OCaml, and I've spent some time thinking about the ways in which those concepts can be applied to organize code in "normal" (i.e., non-dependently-typed) functional programming languages.
I have a couple of concrete questions, and one squishier one.
In terms of concrete questions: A number of constructions in Poly can be encoded in normal functional programming languages like OCaml/Haskell. If you start with something like this:
module type Poly = sig
  type 'a t
  val map : 'a t -> f:('a -> 'b) -> 'b t
end
or:
class Poly p where
  pmap :: (a -> b) -> p a -> p b
(where in both cases we're leaving conditions about functoriality and polynomial-ness unchecked) then it's straightforward to define coproducts, cartesian products, and substitution products. (E.g., in OCaml the cartesian product of 'a P.t and 'a Q.t is just 'a P.t * 'a Q.t, and the substitution product is 'a P.t Q.t.)
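For concreteness, here's how I'd sketch the product and substitution product as OCaml functors (module names are just mine, and as above nothing about polynomial-ness is actually checked):

```ocaml
(* Repeating the Poly signature so this stands alone. *)
module type Poly = sig
  type 'a t
  val map : 'a t -> f:('a -> 'b) -> 'b t
end

(* p × q: the cartesian product is just the pair of the two functors. *)
module Cartesian_product (P : Poly) (Q : Poly) = struct
  type 'a t = 'a P.t * 'a Q.t
  let map (p, q) ~f = (P.map p ~f, Q.map q ~f)
end

(* q ◁ p: the substitution product is just composition of the two functors. *)
module Substitution_product (P : Poly) (Q : Poly) = struct
  type 'a t = 'a P.t Q.t
  let map t ~f = Q.map t ~f:(fun p -> P.map p ~f)
end
```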
But it also turns out that other constructions are definable, just in a not-as-obvious-to-me way. For example, we can define the cartesian closure, Dirichlet product, and Dirichlet closure. E.g., the Dirichlet product is:
(* p ⊗ q *)
module Dirichlet_product (P : Poly) (Q : Poly) : sig
  type _ t = T : 'a P.t * 'b Q.t * ('a -> 'b -> 'c) -> 'c t
  include Poly with type 'a t := 'a t
end = struct
  type _ t = T : 'a P.t * 'b Q.t * ('a -> 'b -> 'c) -> 'c t
  let map (T (p, q, f')) ~f = T (p, q, fun a b -> f (f' a b))
end
or:
{-# LANGUAGE GADTs #-}

-- p ⊗ q
data DirichletProduct p q c where
  T :: p a -> q b -> (a -> b -> c) -> DirichletProduct p q c

instance (Poly p, Poly q) => Poly (DirichletProduct p q) where
  pmap f (T p q f') = T p q (\a b -> f (f' a b))
and the Dirichlet closure is:
(* [q,p] *)
module Dirichlet_closure (P : Poly) (Q : Poly) : sig
  type 'a t = { f : 'b. 'b Q.t -> ('a * 'b) P.t }
  include Poly with type 'a t := 'a t
end = struct
  type 'a t = { f : 'b. 'b Q.t -> ('a * 'b) P.t }
  let map { f } ~f:g =
    { f = (fun q -> P.map (f q) ~f:(fun (a, b) -> (g a, b))) }
end
or:
{-# LANGUAGE RankNTypes #-}

-- [q,p]
newtype DirichletClosure p q a = DirichletClosure (forall b. q b -> p (a, b))

instance (Poly p) => Poly (DirichletClosure p q) where
  pmap f (DirichletClosure g) = DirichletClosure (\q -> pmap (\(a, b) -> (f a, b)) (g q))
with corresponding eval map:
(* P ⊗ [P,Q] → Q *)
module Dirichlet_eval (P : Poly) (Q : Poly) = struct
  let eval : type a. a Dirichlet_product(P)(Dirichlet_closure(Q)(P)).t -> a Q.t =
    fun (T (p, { f }, combine)) -> Q.map (f p) ~f:(fun (y, x) -> combine x y)
end
or:
-- p ⊗ [p,q] → q
dirichletEval :: (Poly q) => DirichletProduct p (DirichletClosure q p) a -> q a
dirichletEval (T p (DirichletClosure f) combine) = pmap (\(y, x) -> combine x y) (f p)
But some constructions I haven't been able to work out how to define. For example, I don't know how to define the left coclosure in OCaml/Haskell. Does anyone know if it's possible (or what framework to use to think about the question)? Perhaps relatedly, the left coclosure doesn't feel as useful to me in "pragmatic" programming, but maybe I'm wrong about that as well?
That also brings me to my "squishier" question: The concepts from Poly and related notions seem like they have a lot of potential in organizing production code. In my day job, I find that one of the most important concerns is separating the structural properties of your code from the business logic, so that each can be examined independently. For code which behaves at all like a state machine, the relevant notion of "structural property" often seems to just be what you get by thinking about it in terms of Poly.
Of course, thinking along these lines is less about theorems and more about ergonomics and context. Anyway, I'm just curious if anyone else is thinking about applying Poly in this way?
Thanks!
If you want the code to be checked by a machine to ensure that it is indeed organised according to these properties, some "abnormality" in your programming language (dependent/refinement typing in particular) is largely inevitable. Otherwise, you would need to construct some sort of ad-hoc symbolic reasoning algorithm and incorporate it into the type-checking phase of the language you are using.
It's not so much that I'm interested in a modification of OCaml/Haskell that makes, e.g., DirichletProduct formally verifiable, but more that I think that developing a toolbox of things like DirichletProduct can help users to declare the important structural properties of their code in the types.
For example, if you have a map in Poly from p to q, this says much more than that you merely have some way of transforming state machines with interface p into state machines with interface q, and the fruitfulness of Poly suggests that it might also be fruitful for coders to be aware of when their code has this extra property.
And in fact you can write code in this style. In OCaml, rather than have two concrete types P.t and Q.t with a map P.t -> Q.t, you would have module P : Poly and module Q : Poly with a polymorphic function val f : 'a P.t -> 'a Q.t, where all the extra guarantees come from the universal quantification of the 'a.
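E.g., a toy instance of that pattern (the modules here are just illustrative stand-ins):

```ocaml
(* Two small polynomial functors: y + 1 (option) and lists. *)
module Opt = struct
  type 'a t = 'a option
  let map t ~f = match t with None -> None | Some x -> Some (f x)
end

module Lst = struct
  type 'a t = 'a list
  let map t ~f = List.map f t
end

(* A map in Poly, written as a single polymorphic function.
   The universal quantification over 'a is what guarantees naturality:
   the function can only rearrange, duplicate, or drop the 'a's. *)
let f : 'a. 'a Opt.t -> 'a Lst.t = function
  | None -> []
  | Some x -> [ x ]
```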
This way of declaring the structural behavior of your code in types is very useful: E.g., it makes it easy to see if someone (perhaps a colleague, perhaps AI) changes it.
So I think I'm interested in:
is it intentional that your Haskell Poly class is the exact same as the Haskell Functor class?
Yeah, it’s the same as Functor in terms of what’s actually defined, but it’s supposed to convey the additional convention that the functor be polynomial (similar to how the Haskell Functor typeclass doesn’t actually verify the laws of functoriality, it’s just a convention)
I think you might need that extra poly structure to define the various operations you want; for example, Dirichlet closure seems like it makes sense for any functor, not just a polynomial functor. Perhaps left coclosure can't be defined for arbitrary functors, but requires the extra structure of Poly that has been forgotten by treating Poly as Functor in Haskell, and that's why it is hard to write?
imo the right tool for the job here is containers. A container is a pair of a set S and an indexed family of sets P : S -> Set; a morphism (S, P) -> (T, Q) is a pair of a function f : S -> T and a function Q(f(s)) -> P(s) for each s (aka a "dependent lens"). The category of containers and dependent lenses is isomorphic to Poly (the proof involves Yoneda but the basic idea is easy: (S, P) corresponds to the polynomial Σ_{s ∈ S} y^{P(s)})
The difference is that the category of containers and dependent lenses is locally small by construction, which makes them much more convenient to program with in practice: you no longer have to rely on parametric polymorphism to enforce naturality. You need dependent types in your metalanguage to do this properly (we use Idris), but you can get a lot of mileage out of doing it in a non-dependently-typed language by replacing each dependent type with its non-dependent erasure and throwing a dynamic error if you end up in the wrong fibre. Either way, we know that you can do a lot of very interesting stuff in software engineering this way
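Something like this in OCaml, say (a rough sketch, assuming fibres can be enumerated at runtime; all the names here are ad hoc):

```ocaml
(* An "erased" container: directions are erased to a single type, and the
   true fibre over each position is only described at runtime. *)
type ('pos, 'dir) container = {
  fibre : 'pos -> 'dir list;  (* runtime enumeration of the fibre over a position *)
}

(* An erased dependent lens: forward on positions, backward on directions. *)
type ('p, 'd, 'q, 'e) lens = {
  fwd : 'p -> 'q;
  bwd : 'p -> 'e -> 'd;
}

(* Run the backward map, throwing a dynamic error on wrong-fibre inputs,
   in place of the typing discipline you'd get in Idris. *)
let run_bwd (src : ('p, 'd) container) (dst : ('q, 'e) container)
    (l : ('p, 'd, 'q, 'e) lens) (p : 'p) (e : 'e) : 'd =
  if not (List.mem e (dst.fibre (l.fwd p))) then failwith "wrong fibre";
  let d = l.bwd p e in
  if not (List.mem d (src.fibre p)) then failwith "wrong fibre";
  d
```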
@Jules Hedges : Oh, that's very exciting that you already know you can do interesting stuff in software engineering that way. Is there some place I can read more about those efforts?
One thing that does give me pause with a representation such as I think you're suggesting is performance. E.g., for the interface A y^B, a state machine which gives you an A and which takes in a B to give you a new state (two maps S -> A and S * B -> S) is equivalent to one which takes in a B and gives you an A and a new state (one map S * B -> A * S, where the A doesn't actually depend on the B). However, computationally they're different, because typically the computation that produces the A and the one that produces the new state will share a lot of computational work, and forcing your code to use one particular representation seems like it might force poor performance.
(Of course, another thing that gives me pause is that because I'm using OCaml rather than Idris, there will have to be dynamic error checking as you said.)
Still both of those things could be surmountable, and I'm very interested for further pointers along these lines!
Another thread is related to what @Ryan Wisnesky pointed out, which is that the definitions I gave don't actually ever really rely on polynomial-ness (this is related to the previous point, because I wanted to preserve the ability to work with composite types like 'a P.t Q.t directly, without having to go through a flattening into a positions-and-directions normal form).
This is getting way outside of my competence, but my understanding is that the categorical way to understand existential/universal types is through coends/ends. I think that suggests that (e.g.) the DirichletProduct code I gave above has a counterpart in terms of a construction on Set^Set that extends the Dirichlet product on Poly, namely for functors F and G in Set^Set we have some F ⊗ G in Set^Set given by:
(F ⊗ G)(c) = ∫^{a,b} F(a) × G(b) × Set(a × b, c)
But I don't know if that's actually well-defined or what properties it might have.
Still it suggests an answer to one of my questions: Presumably, if the left coclosure operation on Poly can't similarly be extended to Set^Set, that's a strong indication that it can't actually be defined in OCaml/Haskell in the way that I was attempting to do.
(Re-reading @Ryan Wisnesky 's comment, I think I'm basically just re-stating it in more words here...)
@Michael O'Connor Here, a container, being a pair of an indexing type with a type family over it, could be instantiated as an upper bound n and some labels Fin n. That could easily be stored as a vector that tracks its length.
Then the container extension, what most folks think of as the container type in programming, has polymorphic functions to and from other container extensions.
The point is: permutations of this in-memory vector are exactly the same as polymorphic functions from the container extension to itself, and simple mappings from this vector to other containers again correspond to polymorphic functions between container extensions. The Yoneda lemma lets you step fancy Idris-style generic programs down into something much simpler.
For fun, I was actually planning on trying to see how far I could fake dependent types in C by working with containers, as opposed to their container extensions. Hedges' comment gives me confidence this approach will work.
Edit: I'm in a rush, so I crushed those details a bit, to the point of being false - the point is that "container" here refers to a specific definition underlying a polynomial functor, rather than what folks normally think of as containers - and this seed inside "the big thing" is much easier to store in memory efficiently.
@Alex Kreitzberg : Are you always going to have an upper bound in your use case?
Typical state machines that I deal with might output a string and take in a string, so have interface string y^string, but you definitely wouldn't want to have an in-memory vector with a location for every possible string (if I'm understanding you correctly)?
Hey Michael, glad you’re enjoying the book!
What you’re noticing here is that the Dirichlet product on Poly is actually the restriction of the Day convolution on Set^Set. This Day convolution is in fact closed, and its closure restricts to Poly as well (i.e. the closure of two polynomials is a polynomial). So none of that requires you to assume a priori that the functors are polynomial, which is why you can generalize to functors Set -> Set (or analogously to Functors in Haskell).
Left coclosures for the composition product in Poly are also generalizations of an even more fundamental concept: left Kan extensions in Set^Set. Once again, the left Kan extension of a polynomial over a polynomial is still a polynomial, which is why everything works in Poly. But everything should also work in Set^Set; there is nothing special about polynomial functors that gives left coclosures. That said, I don’t personally know how they would be implemented in Haskell, but I would check out any packages for left Kan extensions (and for Day convolutions, to compare your work) already in Haskell.
One more thing: the coend formula you gave is exactly right, and it is indeed the formula for the Day convolution of two functors Set -> Set.
Ah, thanks @Nelson Niu : that's very helpful! I do in fact see that there is a kan-extensions Haskell package which includes a Data.Functor.Day module equivalent to the DirichletProduct module I was working with, neat!
Poking around more, I see that the Haskell encoding of left Kan extensions is:
data Lan g h a where
  Lan :: (g b -> a) -> h b -> Lan g h a
which in ocaml is:
module Lan (G : Poly) (H : Poly) = struct
  type _ t = T : ('b G.t -> 'a) * 'b H.t -> 'a t
end
Evidence that this is in fact the coclosure is given by the fact that if G and H are constant (e.g., type 'a G.t = g and type 'a H.t = h), this does in fact reduce to h * (g -> 'a), i.e. the polynomial h y^g, as you'd expect. Neat!
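Spelled out (with throwaway constant modules standing in for g and h, just to see the reduction):

```ocaml
(* Repeating the Poly signature and the Lan encoding so this stands alone. *)
module type Poly = sig
  type 'a t
  val map : 'a t -> f:('a -> 'b) -> 'b t
end

module Lan (G : Poly) (H : Poly) = struct
  type _ t = T : ('b G.t -> 'a) * 'b H.t -> 'a t
end

(* Constant functors at g = string and h = int. *)
module Const_g = struct
  type 'a t = string
  let map t ~f:_ = t
end

module Const_h = struct
  type 'a t = int
  let map t ~f:_ = t
end

module L = Lan (Const_g) (Const_h)

(* With G and H constant, a value of 'a L.t packs exactly an h together
   with a g -> 'a, i.e. the polynomial h * y^g evaluated at 'a. *)
let of_pair (h, k) = L.T (k, h)
let to_pair (x : 'a L.t) = match x with L.T (k, h) -> (h, k)
</imports>
```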
This provides some evidence (at least in my mind) that the coclosure should be useful in some way to pragmatic programming, I'll have to think a bit on how that works...
Would love to hear if you come up with anything!
Michael O'Connor said:
...so have interface string y^string...
Since Niu is commenting I kind of want to give space for you to chat with them, I'm just getting used to this stuff. But my point was that implementing morphisms between containers is usually going to be more performant than implementing polymorphic functions between their extensions. This is because when implementing polymorphic functions you still have to juggle the data the containers hold. Whereas if you work with the morphisms between the containers you're only calculating with the data's location information. Further, dependently typed features can probably be tracked at runtime (but maybe that defeats the point of this exercise for you).
In your example, if you had a Moore machine, a map between the containers is determined by a lens: the two maps readout : S -> A and update : S * B -> S. The specific choices of these maps are going to affect how efficient your implementation can be, but it's probably going to be more efficient than polymorphic functions involving a polymorphic data structure indexed by your interface.
So in my example I rushed through, a list can be expressed with dependent types as Σ (n : ℕ). Fin n -> a. You could reverse a list via the usual recursive polymorphic function, but it's faster to consider the morphism between containers determined by the index function i ↦ n - 1 - i. The point of tracking the upper bound n on the possible positions is that it gives you the data to define n - 1 - i. Closer to my original example, such a map would take a six element list like [x0, x1, x2, x3, x4, x5] to [x5, x4, x3, x2, x1, x0]. Working with the location data like n and i, rather than the actual container data like the xi, is usually going to be faster, is my point.
Nelson Niu said:
What you’re noticing here is that the Dirichlet product on Poly is actually the restriction of the Day convolution on Set^Set. This Day convolution is in fact closed, and its closure restricts to Poly as well (i.e. the closure of two polynomials is a polynomial). So none of that requires you to assume a priori that the functors are polynomial, which is why you can generalize to functors Set -> Set (or analogously to Functors in Haskell).
I can't help but wonder if there's some size issues that come into play when trying to do Day convolution on Set^Set. I'm fine assuming enough universes or whatever so that e.g. the existence of Set^Set is not at stake, but I'm still wondering if the result of the coend might be too large and take you outside of Set, without e.g. fixing the domain to be small sets and codomain to be large sets, restricting to small presheaves, or something like that. Or maybe in this case it all works anyway?
@Alex Kreitzberg : I think I follow what you're saying! I know you said you were going to implement something like this in C, but let me try to work through your List example in Haskell.
I believe you're saying: rather than use the standard List representation, use something like:
-- The Int passed to `access` should be less than `length`,
-- otherwise `access` will throw an exception.
data ListAsContainer a = ListAsContainer
  { length :: Int
  , access :: Int -> a
  }
as an encoding of Σ (n : ℕ). Fin n -> a (where we can't represent Fin n directly in Haskell, so we use a runtime check, but that's ok).
Now the direct encoding of map between these containers would be:
data MapBetweenLists = MapBetweenLists
  { f :: forall a. ListAsContainer a -> ListAsContainer a
  }
but we can also spell this out as:
data MapBetweenLists = MapBetweenLists
  { mapLengths :: Int -> Int
  , mapIndices :: Int -> Int -> Int
  }
Namely, a map forward on lengths and a map backwards on indices (the first Int argument to mapIndices is the length, the second Int argument is the index of the output list, and the return Int is the index of the input list).
The first thing I'm not sure about: I think you'd like to replace an Int -> Int function with a vector for performance, but we need infinitely many of them (one for each length); how does that get encoded?
The other point is that (if you're saying what I think you're saying) I think you would often, but not always, get better performance with this sort of encoding. It's true that a particular rev l call would be faster, but if you have to access the returned container multiple times, it could possibly end up being slower, since an extra indirection is added to each access (whereas if you had reversed the list normally, you'd pay that cost once, but then all further accesses would be cheaper, as they'd have no indirection).
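And spelling the reverse example out in OCaml, where I'm more at home (a translation of the ListAsContainer encoding above; names are ad hoc):

```ocaml
(* OCaml rendering of the ListAsContainer encoding above. *)
type 'a list_as_container = {
  length : int;
  access : int -> 'a;  (* must be called with 0 <= i < length *)
}

(* A container map: forward on lengths, backward on indices. *)
type map_between_lists = {
  map_length : int -> int;
  map_index : int -> int -> int;  (* input length -> output index -> input index *)
}

let reverse : map_between_lists = {
  map_length = (fun n -> n);
  map_index = (fun n i -> n - 1 - i);
}

(* Applying a container map only composes index functions; the elements are
   never touched, but each access now goes through one more indirection. *)
let apply (m : map_between_lists) (l : 'a list_as_container) : 'a list_as_container = {
  length = m.map_length l.length;
  access = (fun i -> l.access (m.map_index l.length i));
}
```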
I have no intuition at all for relative performance of the 2 representations, it's an interesting question
I think you've noticed something I also noticed, which is that it's surprisingly difficult to find something specific you can do with polynomial functors that can't also be done with arbitrary endofunctors
Personally I find programming with Day convolution of functors more difficult than dependent types, but that is both language dependent and brain dependent
@Martti Karvonen yes, I was absolutely sloppy about size issues here. Possibly using different universes for the domain and codomain is the way to deal with that in general, as you suggest. Practically you probably need to restrict the kind of endofunctors under consideration and check that the natural transformations between them are actually (small) sets: this is what happens when we restrict to polynomials, which is why everything works out there. Likely something similar can be said about the sizes of whatever category of types and functions you’re using to model your favorite functional programming language.
I suspect things might be nice enough when you restrict to small presheaves (just like all polynomials), instead of having different sizes at the domain/codomain. Anyway, while I often nitpick size issues when they come up, it's because I want to understand how they're dealt with in a given situation
Michael O'Connor said:
Jules Hedges : Oh, that's very exciting that you already know you can do interesting stuff in software engineering that way. Is there some place I can read more about those efforts?
There's not a tonne written down, but the most developed is André's work on backend programming, containers/polynomials as RESTful APIs: https://arxiv.org/abs/2203.15633
(André is writing his PhD thesis on this so there will be a lot more detail at some point this year)
You can also use lenses for frontend, but we wrote down hardly any of it so far: https://cybercat.institute/2025/01/21/ui-para-optic/
and some more things, but the rest is secret for now sorry
Once you learn to recognise it, the "bidirectional pattern" is fairly common
@Jules Hedges : Thanks for those links, very interesting! I'm interested in following along as more things become public.
To connect the dots with things I've already encountered: I think I would understand the DPara l p r construction from the RESTful APIs paper in Poly as a map l -> [p, r] (where [-, -] is the Dirichlet closure). It seems also related to a construction from this paper where objects are polynomials and a map from p to q is a coalgebra for [p, q]. However, that restricts the state to have the same input and output sets (both S), whereas towards the end it looks like the RESTful APIs paper makes use of the fact that the parameter can have a different output and input interface to handle IO in some way that I don't fully understand yet.
Re: frontend stuff, you might possibly be interested in bonsai. It's in OCaml, so no dependent types, but I think that it's very roughly a Para l p r representation but with some tricks to try to make it more ergonomic and performant; for example, the p type representing state is "existential'd out" with a background framework that manages state for you performantly. I've never used React, but I suspect it's similar to React in that regard (one difference though is that in my understanding state in React is always tied directly to UI components, whereas state in Bonsai is generic).