Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: learning: questions

Topic: call stacks and lenses


view this post on Zulip Nathaniel Virgo (Mar 09 2022 at 02:29):

This is kind of a weird question - I'm not sure how to really make it precise. But what do we actually mean when we write something like

f(x) = g(x) + 1

or

fact(x) = if x==0 then 1 else x*fact(x-1)

?

The first one kind of seems to make sense in terms of function composition - I can see the function f as g post-composed with a function that increments its argument.

But the second one is a bit weirder. We could see the definition as an equation for the function fact, which the compiler solves in order to work out what the function is. But in practice, at least conceptually, that's not really what happens. Instead, when we call fact with a positive argument, the function gets called multiple times until we hit the base case x=0, and then values get passed back up the call stack again in order to generate the final result.

This has a very lens-like feel - each invocation of fact gets an input (x), then passes a "forward" output (x-1) to the next function, then gets the result from that and passes something "backwards" up the stack that depends both on its original input and on the result that was passed backwards to it (x*fact(x-1)).
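A minimal Python sketch of this picture (the `depth` parameter and the printed trace are just for illustration):

```python
# Each invocation receives x, passes x - 1 "forward" to the next call,
# then passes x * result "backwards" up the stack.
def fact(x, depth=0):
    print("  " * depth + f"forward:  x = {x}")
    if x == 0:
        result = 1                           # base case
    else:
        result = x * fact(x - 1, depth + 1)  # forward call, backward combine
    print("  " * depth + f"backward: {result}")
    return result

fact(3)  # prints the forward/backward trace and returns 6
```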

So really my question is, do function calls and call stacks have something to do with lenses in some way? Or maybe with dependent lenses, since recursion might give rise to a branching tree instead of just a linear chain.

Has anything like that been worked out? Or if not, is there some other way to get a handle on "what's really going on" with recursion and call stacks from a category-theoretic point of view?

view this post on Zulip Morgan Rogers (he/him) (Mar 09 2022 at 08:53):

This is an inductive definition, and it makes sense/is well-defined because the natural numbers are an inductive type. That is, we can construct the natural numbers with an introduction rule for 0 and an inference rule that gives us x+1 from x. As a consequence, we can define a function on the natural numbers inductively: the natural numbers are formally taken to be the smallest set of expressions in the language built from the alphabet 0 and S, say, which is closed under the rules of the inductive definition, so it is necessary and sufficient to explain what happens to the output of these rules given the value on their input.
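A sketch of this point in Python, with a hypothetical `fold_nat` playing the role of the recursor: defining a function inductively means giving its value at 0 and a step for the successor case.

```python
def fold_nat(base, step, n):
    """Recursor for the naturals: `base` handles 0, `step` handles x + 1."""
    return base if n == 0 else step(n, fold_nat(base, step, n - 1))

# fact is one instance of the recursor: value 1 at 0, multiply at each step.
fact = lambda n: fold_nat(1, lambda x, rec: x * rec, n)
```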

That might not necessarily be helpful, but the takeaway is that whatever you can say about this example using lenses should extend to inductive definitions more broadly.

view this post on Zulip Matteo Capucci (he/him) (Mar 09 2022 at 10:11):

I guess what you're describing is continuation-passing style (CPS), which is kind of lensy indeed.
AFAIU, in CPS returns are replaced with calls to continuations, for instance

fact(x, k) = if x == 0 then k(1) else fact(x-1, λy. k(x * y))

Such a function is in fact a monadic function (= a map in the Kleisli category) for the [[continuation monad]] K_{\mathbb N} = [[-, \mathbb N], \mathbb N], which indeed amounts to a map X \times [Y, \mathbb N] \to \mathbb N (for fact, X = Y = \mathbb N).
This monad is related to J_{\mathbb N} = [[-, \mathbb N], -] (the selection monad) by a monad morphism -^\dagger : J_{\mathbb N} \to K_{\mathbb N}. I learned about these monads from Escardó and Oliva (they actually wrote many papers doing very cool things using this technology, so I suggest a quick scholar search to the interested reader).
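The CPS factorial above can be transcribed directly into runnable Python (a sketch; the identity continuation extracts the final value):

```python
# CPS factorial: each call hands its result to the continuation k
# instead of returning it up the stack.
def fact_cps(x, k):
    if x == 0:
        return k(1)
    return fact_cps(x - 1, lambda y: k(x * y))

fact_cps(5, lambda r: r)  # returns 120
```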

Anyway, I'm telling you this because we've recently proven that lenses (X,S) \to (Y,R) induce morphisms J_S(X) \to J_R(Y) (this is even true in a bit more generality!). So there's definitely something going on.

Moreover, the backward part of a lens (f, f^\sharp) : (X, \mathbb N) \to (Y, \mathbb N) is f^\sharp : X \times \mathbb N \to \mathbb N, so not quite the same as fact : X \times [Y, \mathbb N] \to \mathbb N, but of course if you have the former (together with the forward map f) you get the latter by:

X \times [Y, \mathbb N] \overset{\Delta_X \times [Y, \mathbb N]}\to X \times X \times [Y, \mathbb N] \overset{X \times f \times [Y, \mathbb N]}\to X \times Y \times [Y, \mathbb N] \overset{X \times \mathrm{eval}}\to X \times \mathbb N \overset{f^\sharp}\to \mathbb N

which is definitely something that deserves more attention...
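The composite above can be sketched in Python with \mathbb N as `int` (the example lens is hypothetical, just to have something to run):

```python
# From a lens (f, f_sharp) build a CPS-style map X x [Y, N] -> N:
# duplicate x, send one copy through f, evaluate the continuation k on it,
# then feed the result backwards through f_sharp.
def lens_to_cps(f, f_sharp):
    return lambda x, k: f_sharp(x, k(f(x)))

# Hypothetical example lens on int: forward doubles, backward adds.
double = lambda x: 2 * x
add_back = lambda x, n: x + n
cps_map = lens_to_cps(double, add_back)
```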

view this post on Zulip Matteo Capucci (he/him) (Mar 09 2022 at 10:12):

(tagging @Jules Hedges here, and @Bruno Gavranovic, who recently entertained a version of lenses/optics where the residual is actually a whole continuation)

view this post on Zulip Graham Manuell (Mar 09 2022 at 15:33):

I don't know about the link to lenses, but recursive function definitions have been thoroughly studied in domain theory.

view this post on Zulip Nathaniel Virgo (Mar 15 2022 at 02:43):

I'm not sure I've managed to make sense of this yet, but here's one thing we can note, following on from Matteo's post.

I found it a bit odd that the return type (\mathbb{N} in this case) is baked in to the continuation monad. It looks kind of like lenses, but every lens has to be \begin{pmatrix}X\\ \mathbb{N}\end{pmatrix}\to \begin{pmatrix}Y\\ \mathbb{N}\end{pmatrix} for some X, Y. We can get rid of that by defining a more lens-like category instead, whose morphisms (X,A)\to (Y,B) are functions X\to (Y\to B)\to A, composed as follows:

X\xrightarrow{f} \big((Y\to B)\to A\big) \xrightarrow{(g\to B)\to A} \big( (((Z\to C)\to B) \to B)\to A\big)

\xrightarrow{\lambda \phi.\lambda h.\phi(\lambda k.k(h))} \big((Z\to C)\to A\big),

where the types in that lambda expression are \phi\colon (((Z\to C)\to B)\to B)\to A, h\colon Z\to C and k\colon (Z\to C)\to B. (Maybe this can be simplified somehow.)

The composition rule is very similar to Kleisli composition in the continuation monad (which Jules explained here), just generalised so that A, B and C can be different types. The identity morphisms are similarly reminiscent of the unit of the continuation monad.

Having written that down I still don't have much intuition for what a function of type X \to (Y\to B)\to A really is or how it relates to call stacks and recursive functions. It will come to me eventually. But the last equation in Matteo's post shows that this is a generalisation of lenses, in that if we have a lens we can convert it into one of these things. Presumably that forms a functor by which lenses embed into this category.

One question I have is how this would be combined with effectful computation. One thing I'm interested in is probabilistic programs, where we might want to define a probability distribution by sampling recursively. (Simple example: flip a coin until it comes up tails and count the number of heads.) Is there a version of the continuation monad that would work in a Markov category somehow, instead of a Cartesian closed one?
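The coin-flip example can at least be written down as a recursive sampler in Python (a sketch; `rng` is a hypothetical source of uniform samples, so the result is a geometric random variable):

```python
import random

# Flip a fair coin until it comes up tails, counting heads: a distribution
# defined by recursive sampling, in the same shape as the recursive fact.
def count_heads(rng=random.random):
    if rng() < 0.5:              # tails: base case
        return 0
    return 1 + count_heads(rng)  # heads: recurse, add one on the way back
```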

view this post on Zulip Matteo Capucci (he/him) (Mar 15 2022 at 09:17):

Very cool!

view this post on Zulip Matteo Capucci (he/him) (Mar 15 2022 at 09:18):

I wonder if morphisms shouldn't also have a 'forward' part, something of type X \to Y?

view this post on Zulip Matteo Capucci (he/him) (Mar 15 2022 at 09:22):

Also have you heard of grates? My gut feeling says they are related, perhaps dual

view this post on Zulip Nathaniel Virgo (Mar 15 2022 at 09:29):

I think the forward part is sort of secretly already there.

Or rather, if you have a lens consisting of f\colon X\to Y and f^\sharp\colon X\times B \to A then you can form a function X\times (Y\to B)\to A \cong X\to (Y\to B)\to A using the same manoeuvre you did:

X\times (Y\to B) \xrightarrow{\Delta_X\times (Y\to B)} X\times X\times (Y\to B) \xrightarrow{X\times f \times (Y\to B)} X\times Y\times (Y\to B) \xrightarrow{X\times \operatorname{eval}} X\times B \xrightarrow{f^\sharp} A

So the information about both f and f^\sharp is already included in the function X\to (Y\to B)\to A.
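In curried form this manoeuvre is a one-liner (a Python sketch with a hypothetical lens, X = Y = B = int and A = str, to keep the four types visibly distinct):

```python
# From a lens (f : X -> Y, f_sharp : X x B -> A), build the periscope-shaped
# map X -> (Y -> B) -> A: apply f to x, hand the result to the continuation,
# then pass what comes back through f_sharp together with the original x.
def lens_to_periscope(f, f_sharp):
    return lambda x: lambda k: f_sharp(x, k(f(x)))

view = lambda x: x + 1                 # f : X -> Y
update = lambda x, b: f"x={x}, b={b}"  # f_sharp : X x B -> A
p = lens_to_periscope(view, update)
```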

view this post on Zulip Nathaniel Virgo (Mar 15 2022 at 09:29):

I've heard of grates but I don't know anything about them - I'll have to look into it.

view this post on Zulip Matteo Capucci (he/him) (Mar 15 2022 at 09:48):

Well, you surely can go from lenses to 'these things' (please christen them with your favorite variety of cut glass - I propose periscopes), but I don't see how you can define a lens given one of those. In X \to (Y \to B) \to A, Y is in negative position, whereas in X \to Y it is in positive position.

view this post on Zulip Matteo Capucci (he/him) (Mar 15 2022 at 10:34):

Btw, if we take seriously that X \to (Y \to B) \to A is the backwards part of a mixed optic, then you can read off the action generating it, namely \bullet := - \times [-, =] : \mathcal C \times \mathcal C^{op} \times \mathcal C \to \mathcal C.
Suppose \mathcal C is cartesian closed. Then

(X, Y) \bullet (Z, W) \bullet B = (X,Y) \bullet (Z \times [W, B]) = X \times [Y, Z \times [W, B]] = X \times [Y, Z] \times [Y, [W, B]] = X \times [Y,Z] \times [Y \times W, B].

Hence in order to make \bullet into an action, you have to equip the acting category \mathcal C \times \mathcal C^{op} with the product:

(X,Y) \odot (Z,W) = (X \times [Y,Z], Y \times W)

I'd call this exponential affine, since it's basically the same as the product used to get affine traversals, which is the composition product of coefficients of affine functions ((X,Y) \odot (Z,W) = (X + Y \times Z, Y \times W)).
For \odot to be associative, it suffices to have a distributive law [X, Y \times Z] \cong [X, Y] \times [X, Z], which we indeed have by cartesianity of \times.
Now a bit of coend calculus (basically following Prop 3.26 here) yields:

(X,A) \to (Y,B) = \int^{(M,N)} \mathcal C(X, M \times [N, Y]) \times \mathcal C(M \times [N, B], A) \cong \mathcal C(X \times [[X,Y], B], A)

view this post on Zulip Matteo Capucci (he/him) (Mar 15 2022 at 10:40):

To get \mathcal C(X, Y) \times \mathcal C(X \times [Y,B], A) seems quite hard... I'll have a thought about it

view this post on Zulip Nathaniel Virgo (Mar 15 2022 at 13:43):

I think "continuation lenses" would be a logical name, but I like "periscopes" also. I agree that you can't get a lens from a periscope, that makes sense.

view this post on Zulip Nathaniel Virgo (Mar 15 2022 at 13:48):

(I'll have to think about the rest properly later on, it's late here)

view this post on Zulip Jules Hedges (Mar 16 2022 at 15:39):

Lots of interesting stuff to think about here! I've considered this exact category a little bit before, but I never wrote anything down and I don't remember what I concluded about it, if anything

view this post on Zulip Nathaniel Virgo (Mar 19 2022 at 13:38):

I have a bunch more thoughts on this, but it might take me a while to get them organised / worked out enough to write down. For now I just want to note that the composition rule for periscopes can be simplified a bit. If we have f\colon X\to (Y\to B)\to A and g\colon Y\to (Z\to C)\to B, then their composite (f;g)\colon X\to (Z\to C)\to A turns out to be the function

\lambda x.\lambda h. f(x)(\lambda y.g(y)(h)),

where the types are x:X, y:Y and h:Z\to C. Which I guess is what you'd get from "type-driven programming", but I got to it by simplifying the definition I gave above.
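A Python sketch of this composition rule, with periscopes as curried functions (the two example periscopes are hypothetical, chosen only so the composite can be run):

```python
# (f ; g)(x)(h) = f(x)(lambda y: g(y)(h)):
# f receives a continuation that routes each y : Y through g and then h.
def compose(f, g):
    return lambda x: lambda h: f(x)(lambda y: g(y)(h))

f = lambda x: lambda k: k(x + 1) * 10  # X -> (Y -> B) -> A
g = lambda y: lambda h: h(y * 2) + 3   # Y -> (Z -> C) -> B
fg = compose(f, g)                     # X -> (Z -> C) -> A
```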

view this post on Zulip Bruno Gavranović (Mar 21 2022 at 17:39):

I just had a chat with @Matteo Capucci (he/him) about this, where he explained to me that this structure is a composition of a lens and a grate.

But this kind of thing already has a name: glass. It's defined in Profunctor Optics, a categorical update (Section 3.3.2)

view this post on Zulip John Baez (Mar 21 2022 at 18:14):

I hope Google develops this and calls it Google Glass.

view this post on Zulip Nathaniel Virgo (Mar 22 2022 at 03:27):

Bruno Gavranovic said:

But this kind of thing already has a name: glass. It's defined in Profunctor Optics, a categorical update (Section 3.3.2)

Thanks! I find that paper really hard - I never seriously coded in Haskell, so its motivation is quite different from where I'm coming from - and I've never gone through it.

Is it really the same though? Their definition looks to me like (Y\times((Y\to X)\to A))\to B (or maybe it should be (X\times((X\to Y)\to B))\to A?), which seems related to X\to (Y\to B)\to A but not obviously the same.

view this post on Zulip Nathaniel Virgo (Mar 22 2022 at 03:52):

Here's the thought I keep having, which might be related. But I am also not sure if it will make sense to anyone besides me.

Take a periscope (glass?) X\to (Y\to B)\xrightarrow{f} A. The input type X\times (Y\to B) looks kind of like a decision problem, if we think of B as a utility. You get an input in X and a map from outputs Y to utilities B. The thing we return, A, looks like a coutility being passed back to a previous player.

But this is a bit weird, because we're taking a decision problem as input and passing back a coutility, but we're skipping over the part where a decision is actually made. I think this is why Matteo says they should also have a forward map XYX\to Y. But what makes more sense to me is if the decision in YY is a function of a the decision problem that we receive as input. In that case it should be X(YB)YX\to (Y\to B)\to Y. Then there should also be a map XYBAX\to Y\to B\to A saying how the coutility depends on the input, output and utility, and we should demand that the function ff factors through these somehow. I didn't work out the details of that yet though. (I was actually trying to do something a bit more ambitious, involving a monadic version, so that it would let us talk about probability.)