Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: event: Topos Colloquium

Topic: Valeria de Paiva: "Categorical Explicit Substitutions"


view this post on Zulip Tim Hosgood (Aug 19 2021 at 16:22):

starting in 40 minutes (really sorry for the late announcement here — i was convinced i'd already posted about it...)

Abstract:

The advantages of functional programming are well-known: programs are easier to write, understand and verify than their imperative counterparts. However, functional languages tend to be more memory-intensive, and these efficiency problems have hindered their wider use in industry. The xSLAM project tried to address these issues by using explicit substitutions to construct and implement more efficient abstract machines, proved correct by construction.

In this talk I recap two results from the xSLAM project which haven't been sufficiently discussed. First, we provided categorical models for the calculi of explicit substitutions (linear and cartesian) that we are interested in. No one else seems to have provided categorical models for explicit substitution calculi, despite the large number of explicit substitution systems available in the literature. Indexed categories provide models of cartesian calculi of explicit substitutions. However, these structures are inherently non-linear and hence cannot be used to model linear calculi of explicit substitutions. Our work replaces indexed categories with presheaves, thus providing a categorical semantics covering both the linear and cartesian cases. Our models satisfy soundness and completeness, as expected.

Secondly, we recall a different linear-non-linear type theory, built from Barber and Plotkin's DILL ideas, which, like DILL, is better for implementations. Unlike DILL, this type theory, called ILT, satisfies an internal language theorem. We describe ILT, give categorical semantics for it and sketch the proof of its internal language theorem, thus justifying its use in implementations. These results are examples of `(categorically) structured deep syntax', to borrow Hyland's characterization.
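As a rough illustration of the general idea (not the precise calculi of the talk): in an explicit substitution calculus, $\beta$-reduction does not perform substitution as a one-step meta-level operation; it merely creates a substitution annotation, which is then propagated through the term by its own small-step rules, for example

$$(\lambda x.t)\,u \to t\langle x\mapsto u\rangle, \qquad (t\,s)\langle x\mapsto u\rangle \to (t\langle x\mapsto u\rangle)\,(s\langle x\mapsto u\rangle), \qquad x\langle x\mapsto u\rangle \to u,$$

so that the cost and timing of substitution become visible to, and controllable by, an abstract machine.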

view this post on Zulip Tim Hosgood (Aug 19 2021 at 16:23):

YouTube: https://www.youtube.com/watch?v=Z_gu1r7LNyc
Zoom: https://topos-institute.zoom.us/my/it.admin?pwd=Znh3UlUrek41T3RLQXJVRVNkM3Ewdz09

view this post on Zulip Jesse Sigal (Aug 19 2021 at 18:50):

Thank you for the great talk! I was unaware of "Categorical Models for Intuitionistic and Linear Type Theory", so now my reading list is extended :)

view this post on Zulip Valeria de Paiva (Aug 19 2021 at 18:57):

thanks Jesse! do send questions, if you have them!

view this post on Zulip Kartik (Aug 19 2021 at 19:12):

Indeed, great talk! Just finished watching the stream. I have been working on defining semantics for a quantum language with both classical and substructural core calculi, with a clean separation between the two, and ILT seems like something I should look into in more detail.

view this post on Zulip Valeria de Paiva (Aug 19 2021 at 19:24):

Thanks, Kartik! I don't know much about quantum phenomena, but always happy to discuss stuff.

view this post on Zulip Mike Shulman (Aug 19 2021 at 20:35):

I'm sorry that I missed this talk, but I just watched the video. Very nice! This idea of taking contexts more seriously in the categorical models is definitely something that I feel strongly about too. Although my own approach, at least in the simply typed case, has generally been multicategorical: rather than the contexts being incarnated as objects of some category, there is a multicategory whose objects are single types, and it's just the morphisms that can have multiple types in their domains (and possibly also codomains, if you move to polycategories). The indexed approach has its own virtues, of course (e.g. it seems to generalize better to the dependently typed case). Anyway, I thought maybe I would mention that there is also a multicategorical approach to linear-nonlinear logic, which I recently posted a preprint about. I believe your IL-indexed categories are equivalent to a certain class of LNL multicategories (Examples 3.10 in my preprint), so in this case both approaches are capturing the same information.

view this post on Zulip Valeria de Paiva (Aug 19 2021 at 23:56):

Many thanks for your message, Mike! I know that your preprint exists and have been meaning to read it for a while, will do that as soon as possible. yes, I must say that the idea of context-handling categories came about after we tried somewhat with the idea of "syntactic multicategories". Starting from Lambek's definition of multicategories (Multicategories Revisited) and wanting to pin down composition of morphisms (to make them closer to the implementation) we were led (back -- in Eike's case), not very willingly (in my case), to the indexed category notions. Thomas Ehrhard's "A Categorical Semantics of Constructions" (LICS 1988) is the original motivation for Eike's calculus of constructions semantics, I think. and yes, I also believe that our IL-indexed cats are equivalent to a class of LNL multicategories and that more work will make clearer what exactly we need from where. I am hoping to find people interested in investigating e.g. SystemF+Linearity, cutdown versions of MLTT + linearity, modal type theories of various stripes, etc. Because I still believe that (well-established) categorical semantics can really help debug programmers' wildly clever uses of syntax. thanks.

view this post on Zulip Mike Shulman (Aug 20 2021 at 01:23):

Something else, which may be a bit of a digression: I didn't quite follow what your opinion is about Melliès's famous counterexample.

As an outsider, my understanding is that it was a counterexample to strong normalization, i.e. the claim that every reduction sequence built from certain rewrite rules terminates in a normal form. But in practice, does strong normalization really matter? It seems like in practice what we want is a normalization algorithm, i.e. a particular reduction strategy that does always terminate, and why would we care that there are other strategies built from the same "rewrite rules" that don't terminate?

view this post on Zulip Valeria de Paiva (Aug 20 2021 at 03:35):

yes, this is how I understand it too. Melliès's example shows lambda-sigma doesn't satisfy strong normalization. But in practice we don't need strong normalization, at least not for abstract machines and for proof assistants, our intended applications. this was what I tried to say--that the goal that people thinking of rewriting theories gave themselves was (to my mind) the wrong goal.

view this post on Zulip Mike Shulman (Aug 20 2021 at 14:16):

Thanks! I thought that's what you might have been saying but I didn't quite follow. Do you know of a good reference that explains syntactic explicit substitutions from this correct viewpoint? E.g. proving that a sensible reduction strategy does succeed in normalizing things?

view this post on Zulip Mike Shulman (Aug 20 2021 at 14:17):

Another odd thought I had today is that explicit substitutions seem a little bit related to normalization by evaluation. If we think of the "evaluation in an environment" step of NbE as pushing a multiple explicit substitution through a term, and give rules saying that it gets stuck when it hits an abstraction and doesn't go further in until the abstraction is applied, at which point it gets augmented by the term being substituted (which could be a variable at readback time).
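To make the analogy concrete, here is a minimal NbE sketch in Haskell; it is only illustrative, not taken from any of the papers in this thread, and all names are chosen for the sketch. The environment plays the role of a multiple explicit substitution: evaluation pushes it through the term, gets stuck at an abstraction (a closure), and resumes when the abstraction is applied; readback applies closures to fresh variables.

```haskell
data Term = Var String | Lam String Term | App Term Term
  deriving Show

-- Semantic values: a lambda evaluates to a closure, i.e. its body together
-- with the environment (a "multiple explicit substitution") it was met in.
data Value = Closure Env String Term | Neutral String [Value]

type Env = [(String, Value)]

-- Evaluation pushes the environment through the term; it gets stuck at an
-- abstraction (producing a Closure) and only goes further once that
-- abstraction is applied, at which point the environment grows by one binding.
eval :: Env -> Term -> Value
eval env (Var x)   = maybe (Neutral x []) id (lookup x env)
eval env (Lam x t) = Closure env x t
eval env (App t u) = apply (eval env t) (eval env u)

apply :: Value -> Value -> Value
apply (Closure env x t) v = eval ((x, v) : env) t
apply (Neutral x args)  v = Neutral x (args ++ [v])

-- Readback ("quoting"): to go under a lambda, apply the closure to a fresh
-- variable; this is the "variable at readback time" mentioned above.
-- Fresh-name generation is naive, which is good enough for a sketch.
quote :: Int -> Value -> Term
quote n (Closure env x t) =
  let fresh = x ++ show n
  in  Lam fresh (quote (n + 1) (eval ((x, Neutral fresh []) : env) t))
quote n (Neutral x args) = foldl App (Var x) (map (quote n) args)

normalize :: Term -> Term
normalize = quote 0 . eval []
```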

view this post on Zulip Mike Shulman (Aug 20 2021 at 14:17):

Does that make any sense?

view this post on Zulip Valeria de Paiva (Aug 20 2021 at 15:55):

well, on references I am biased of course, but I think Eike Ritter's solo paper "Characterizing Explicit Substitutions that preserve termination" https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49.4329&rep=rep1&type=pdf might be a good read.
and it had never occurred to me to think of explicit subs and NbE together, or in the same context (everyday use of the word), but indeed, now that you mention it, it does make some sense.

view this post on Zulip Mike Shulman (Aug 20 2021 at 18:52):

Thanks, that's a nice paper! It's phrased in terms of strong normalization too, but after reading I see the two perspectives aren't necessarily that different: we can (partially) describe a "sensible strategy" for normalization by restricting the rewrite rules it is allowed to use.

view this post on Zulip Mike Shulman (Aug 20 2021 at 18:53):

The first paragraph also connects explicit substitutions to environment machines, which are I think also related to NbE.

view this post on Zulip Damiano Mazza (Aug 24 2021 at 21:41):

Sorry I see this a bit late, but I'd like to add something concerning explicit substitutions (after thanking Valeria for her talk, which I saw only today!).

It is true that weak normalization is the fundamental normalization property, both from the viewpoint of theory (cut-elimination, consistency of the underlying logical system, etc.) and practice (programming languages, abstract machines, etc.). However, the extra generality of so-called "calculi", as opposed to programming languages with a fixed reduction strategy, has an important foundational role, in particular because it provides the means of studying strategies themselves and of stating/organizing results about them and about their relationships. In these "strategyless" calculi, strong normalization is very natural and it is normal to perceive its failure (or the failure of preserving it) as a sign that something is wrong. (Incidentally, there's even a conjecture by Barendregt, Geuvers and Klop stating that, within the world of PTSs (pure type systems), there's no difference between weak and strong normalization: a PTS is either strongly normalizing, or it has terms having no normal form. IIRC, this has been proved for a relatively wide range of PTSs, but the general statement is still open).

With 30 years of hindsight, I think that trying to "fix" explicit substitution (ES) systems so that they preserve SN was not a "wrong goal", because it ended up pushing the development of better behaved ES calculi in all respects, not just SN. Valeria did not mention it because she focused on a different direction, but a lot has happened since Delia Kesner's 2008 paper cited in the talk. In particular, around 2010 Delia herself and Beniamino Accattoli introduced the linear substitution calculus (LSC), which I find a true turning point in the theory of ESs. (The genealogy of this calculus is a bit complex: the LSC first appeared in an RTA 2012 paper by Beniamino alone, but its development started jointly with Delia for a work they published later, with other people, in POPL 2014. Also, the LSC improves on a previous proposal independently made a few years earlier by Robin Milner). As I see it, the LSC is based on two fundamental ideas:

  1. Explicit substitutions in the $\lambda$-calculus come from linear logic: if you write down a term calculus for the $\{\multimap, !\}$ fragment of ILL, however you do it, you're going to see let binders pop up, and these behave very much like explicit substitutions, in the sense that normalization needs commuting conversions. Now, if you take Girard's (call-by-name) embedding of the $\lambda$-calculus in ILL, you see that $\beta$-reduction decomposes exactly along the lines of what an ES calculus would do, with the explicit substitutions corresponding to the let binders for exponentials. While I think that this had been noticed immediately after the introduction of linear logic, Delia and Beniamino took it a step further and designed an ES calculus (the structural $\lambda$-calculus, a precursor to the LSC) by mimicking the behavior of ILL terms which are in the image of Girard's embedding. The upshot is that, since Girard's embedding preserves SN, so does the structural $\lambda$-calculus: linear logic is the "right" way of implementing explicit substitutions.
  2. Explicit substitutions should behave like boxes in proof nets: the let binders mentioned above correspond to what are known as "exponential boxes" in proof nets. Cut-elimination in proof nets has the amazing property (from the rewriting viewpoint) that it does not need commuting conversions. In terms of explicit substitutions, this means that ESs should not need to "move" through a term in order to be applied, one should be able to apply them just where they are created. Rewriting should happen "remotely" or, as they say, "at a distance". Beniamino and Delia noticed that there's a consistent way of generalizing redexes in a term calculus with ESs (no need for proof nets!) so that they are transparent to the presence of ESs. Combine this with the above idea (that ESs should behave like in linear logic, i.e., be "linear", substituting one occurrence at a time) and you obtain the LSC.
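For concreteness, the resulting calculus can be summarized roughly as follows; this is a reconstruction from memory rather than the official definition, so please check the papers cited below. Terms are $t ::= x \mid \lambda x.t \mid t\,u \mid t[x\mapsto u]$, where $t[x\mapsto u]$ is an explicit substitution. Writing $L\langle t\rangle$ for a term $t$ under a possibly empty list $L$ of explicit substitutions, and $C\langle t\rangle$ for a term with a distinguished occurrence of $t$ in a context $C$, the rules act "at a distance":

$$L\langle\lambda x.t\rangle\,u \;\to\; L\langle t[x\mapsto u]\rangle \qquad (\beta\text{ at a distance})$$

$$C\langle x\rangle[x\mapsto u] \;\to\; C\langle u\rangle[x\mapsto u] \qquad (\text{linear substitution, one occurrence at a time})$$

$$t[x\mapsto u] \;\to\; t \quad \text{if } x \notin \mathrm{fv}(t) \qquad (\text{garbage collection})$$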

The LSC is extremely simple (the usual syntax: variables, abstraction, application and ESs; only 3 rewriting rules, or even 2 if you can do away with garbage collection) and has spectacular rewriting properties (of course it preserves SN, but that's the least of it). Its best quality, as far as I am concerned, is that you do not need to be a rewriting expert (I certainly am not) to appreciate its usefulness! In particular, there are two areas in which it found important applications:

From the categorical viewpoint, the LSC is a blank sheet: no one has looked at it from that perspective yet! Personally, I see it more as a particularly useful example of the "rewriting at-a-distance" methodology, which may have nice applications for describing the internal logic of categories with "positive" constructors (tensor, coproduct, linear logic exponential...).

Apologies for the overly long post, but if someone here is interested in explicit substitutions, I thought that they might find this supplementary perspective interesting!

view this post on Zulip Valeria de Paiva (Aug 24 2021 at 23:44):

Thanks for the discussion @Damiano Mazza ! yes, the previous time I gave this talk I spoke straight after Beniamino in a meeting organized by Mauricio Ayala-Rincón, so I did mention our different motivations and I mentioned his and Hugo's "cost model", which is indeed a nice and pleasant surprise. But indeed what I find surprising is that no one looked at the LSC calculus or any other ES calculi using categorical tools, apart from us! this is the reason why I wanted to talk about it in the colloquium. the connection to implicit computational complexity (ICC) is interesting, but I haven't got a categorical angle on it. (yet, perhaps.)

From the categorical viewpoint, the LSC is a blank sheet: no one has looked at it from that perspective yet! Personally, I see it more as a particularly useful example of the "rewriting at-a-distance" methodology, which may have nice applications for describing the internal logic of categories with "positive" constructors (tensor, coproduct, linear logic exponential...).

well, some of us would like to see "rewriting" also as category theory. in my case along the lines of Barney Hilken's "Towards a proof theory of rewriting: the simply typed 2λ-calculus". and the notion of positive/negative constructors I find it not very clear. but sure, I think this alternative perspective is very welcome!

view this post on Zulip Mike Shulman (Aug 25 2021 at 02:57):

Thanks very much @Damiano Mazza, that's fascinating and added a whole lot to my reading list. Is there any connection between the LSC and Lamping's so-called "optimal" reduction method for lambda-calculus, which proceeds by graph rewriting and I believe is also closely connected to linear logic?

view this post on Zulip Damiano Mazza (Aug 25 2021 at 09:31):

Valeria de Paiva said:

what I find surprising is that no one looked at the LSC calculus or any other ES calculi using categorical tools, apart from us!

Yes, the problem is that, as you know better than me, the world of programming languages is (mistakenly, I think) split into "semantic-minded" and "syntactic-minded" people. Those who do rewriting theory strongly tend to belong to the second group, and since the equation "categories = semantics" is widely held to be true (which is, in my opinion, another misconception), rewriting people have a tendency to stay away from categories. This is why I deeply appreciate the kind of work you have presented! It points to the fact that categorical tools can be useful for rewriting too. In that case, if rewriting theorists don't go to categories, categories will end up going to them :-)

some of us would like to see "rewriting" also as category theory. in my case along the lines of Barney Hilken's "Towards a proof theory of rewriting: the simply typed 2λ-calculus". and the notion of positive/negative constructors I find it not very clear.

I am "some of us"! :-) Hilken's approach (initially advocated by Seely) that rewriting happens in the second dimension of a 2-categorical framework is fully exploited in my work (with my former Ph.D. students Luc Pellissier and Pierre Vial) on a general explanation of intersection types. Admittedly, though, this idea that "rewriting = 2-arrows" for the moment is more of a working hypothesis than an actual, fully-formed theory. (I'll ping @Tom Hirschowitz on this: he has thought a lot about the question and, more generally, about categorical approaches to defining the syntax of programming languages, he recently made some advances on this and may have something interesting to say).

In fact, it seems plausible that rewriting theory (which is more than just the rewriting rules) takes place from dimension 2 upwards in a higher categorical framework. The 3-arrows would be "standardization paths" (ways of transforming rewriting sequences into "well behaved" ones, which are often studied in rewriting theory) and (invertible) 4-cells too seem to have a meaning in terms of rewriting (ask Paul-André Melliès :-) ). What's missing, I think, is a good understanding of where this higher-categorical structure comes from. In "nice" programming languages and logical systems, the rewriting rules usually result from orienting equalities coming from categorical structures, but since these structures are "just" 1-categorical, they say nothing about the induced rewriting theory. It would be nice to say, for example, that the $\lambda$-calculus is the free $n$-category (or, perhaps, $\infty$-category) with such and such property/structure, so that the property/structure itself automatically induces all the higher arrows corresponding to $\beta$-reduction and its associated rewriting theory (notice that strict $n$-categories should be enough for rewriting, so there is no question as to what these are precisely). I remember that Eric Finster was doing something of this sort for multiplicative linear logic, using an opetopic approach, but I don't know how far he managed to go.

A naive observation is that, in simple cases (e.g. propositional connectives), the 2-arrows corresponding to reduction rules are generated by a weakening of the isomorphism defining the connective in a multicategorical setting. For example, implication is defined by the (natural) iso

$$\mathcal C(\Gamma,A;B) \cong \mathcal C(\Gamma;A\to B)$$

(where I am writing $\mathcal C(A_1,\ldots,A_n;B)$ for the homset of a cartesian multicategory $\mathcal C$). Now, if we go up one level, and consider $\mathbf{Cat}$-enriched multicategories, the iso may be weakened into an adjunction (with some extra compatibility properties), with the functor going in the forward direction (the one "introducing the connective", left to right in the above iso) being the right adjoint. If you look at the counit of this adjunction, it's exactly $\beta$-reduction, whereas the unit is $\eta$-expansion. This works with every propositional connective (including those of linear logic, if the multicategory is just symmetric), except that, in some cases, the forward functor is the left adjoint. I wrote this up with a bit more detail in Sect. 1.1.5 of my habilitation thesis. Notice that the multicategorical setting simplifies Seely's original approach, which resorts to lax 2-adjunctions and requires the simultaneous presence of products (which, as you said, people do not necessarily want to be "forced upon" them). This is another advantage of using multicategories, as @Mike Shulman was pointing out above.
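To spell out that claim in a small example (my own notation, and only a sketch): for implication, write $\lambda$ for the forward functor $\mathcal C(\Gamma,A;B)\to\mathcal C(\Gamma;A\to B)$ and let the left adjoint send $u$ to $u\,x$, the application of $u$ to a fresh variable $x:A$. The counit at $t\in\mathcal C(\Gamma,A;B)$ and the unit at $u\in\mathcal C(\Gamma;A\to B)$ are then 2-arrows

$$(\lambda x.t)\,x \Rightarrow t \qquad\text{and}\qquad u \Rightarrow \lambda x.(u\,x),$$

i.e. a $\beta$-step (against a generic argument; the general case follows by composing with the multicategory structure) and an $\eta$-expansion.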

The right adjoint/left adjoint distinction matches the standard classification of logical connectives as "negative" and "positive", respectively. It also corresponds to the distinction, well-known in the proof theory of linear logic, of those connectives whose right rule (resp. left rule) is reversible in sequent calculus. This is what I was alluding to when I spoke of "positive constructor". For reasons that I do not fully understand, positive connectives are "messy" in terms of rewriting: in term calculi, they are usually handled by means of "let" binders, inducing all those annoying commutation rules ("commuting conversions"). These may be eliminated by appealing to the "rewriting at-a-distance" methodology of the LSC, which is a great benefit, because I don't think that commuting conversions are going to get any satisfactory categorical explanation, as they are not computational in nature, they are just artefacts of (a certain treatment of) the syntax.
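For readers unfamiliar with the term, here is a typical commuting conversion, in a made-up but standard notation: when a positive connective is eliminated by a let binder, a redex can be blocked by that binder, and one adds rules commuting the surrounding context past it, e.g.

$$(\mathsf{let}\ x\otimes y = t\ \mathsf{in}\ s)\,u \;\to\; \mathsf{let}\ x\otimes y = t\ \mathsf{in}\ (s\,u),$$

which is precisely the kind of rule that the "rewriting at a distance" methodology makes unnecessary.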

view this post on Zulip Damiano Mazza (Aug 25 2021 at 09:56):

Mike Shulman said:

Is there any connection between the LSC and Lamping's so-called "optimal" reduction method for lambda-calculus, which proceeds by graph rewriting and I believe is also closely connected to linear logic?

No, there is no direct connection. Lamping's sharing graphs implement $\beta$-reduction as a local process, duplicating terms "piece by piece" (so the intermediate steps do not correspond to any term at all; in fact, not even normal forms do!), whereas the LSC works by a parsimonious form of "traditional", non-local duplication.

Now, it is plausible that one may concoct a term calculus isomorphic to sharing graphs using the same "rewriting at-a-distance" methodology of the LSC. But it would yield something entirely different. In fact, the difference is so radical that, contrary to ordinary $\beta$-reduction, counting reduction steps in sharing graphs is not a reasonable cost measure, as shown long ago by Andrea Asperti and Harry Mairson. The failure is actually catastrophic: there is a non-elementary gap (meaning, not bounded by any tower of exponentials of fixed height!) between optimal reduction and Turing machines. Although I do not fully understand the details of Asperti and Mairson's paper, I think that what's going on is that, as I mentioned above, normalization in sharing graphs does not reach an "honest" normal form: the normal form of a sharing graph which is the image of a $\lambda$-term $M$ is, in general, not the image of the normal form of $M$, but a highly compressed form of it. What Asperti and Mairson show is that the decompression operation is not elementary recursive. The LSC does not have this problem because, in its syntax, one may judiciously decompress terms as needed (but not more, otherwise the exponential explosion in the size of terms would affect it too), and the resulting normal form is the "true" normal form (the normal terms of the LSC are exactly the normal plain $\lambda$-terms).

view this post on Zulip Mike Shulman (Aug 25 2021 at 13:50):

Thanks! It's intriguing, then, that there are two different applications of linear logic to lambda-calculus reduction.

view this post on Zulip Mike Shulman (Aug 25 2021 at 14:30):

@Damiano Mazza What paper would you recommend to read first as an introduction to LSC?

view this post on Zulip Mike Shulman (Aug 25 2021 at 15:21):

Also, I wonder whether you are aware of any work explicitly relating ES to NbE?

view this post on Zulip Cody Roux (Aug 25 2021 at 16:09):

Not sure I understand what's going on here, but there's a whole section in Andreas' Habilitation: https://www.cse.chalmers.se/~abela/habil.pdf

view this post on Zulip Mike Shulman (Aug 25 2021 at 18:43):

@Cody Roux If you mean section 3.5, it seems to be just about putting explicit substitutions in the syntax that is being normalized, rather than relating the NbE algorithm itself to ES rewriting as I suggested further up in this thread (that's what I meant to be asking about).

view this post on Zulip Cody Roux (Aug 25 2021 at 20:04):

So do you want an NbE that evaluates terms up-to ES reductions, or...?

view this post on Zulip Damiano Mazza (Aug 25 2021 at 20:21):

Mike Shulman said:

What paper would you recommend to read first as an introduction to LSC?

This question is harder to answer than it should be :-) A survey paper would be great, but unfortunately there's none at the moment. So I'm going to go ahead with a shameless self-promotion and suggest our ICFP 2014 paper on the LSC and abstract machines. I really think it's a good starting point because it presents several subcalculi/strategies of the LSC, so it offers a fairly broad picture of what can be done with it. Just skip all the technical details about the machines (which are, like, 3/4 of the paper). Also, that paper comes with a video of the 20-minute talk I gave back then, in which I stress the linear-logical roots of the LSC in a way that is not done in any paper I am aware of (including our own!).

On the downside, our ICFP 2014 paper has basically no result about the rewriting theory of the LSC itself. Those may be found in Beniamino's RTA 2012 paper, which is where the LSC was first introduced (in Sect. 7), but it's probably too technical to be a first introduction. I know that Beniamino gave several excellent talks on the LSC, but unfortunately none seems to have been recorded and made available on the web :-( If I find one, I'll point it out here.

view this post on Zulip Damiano Mazza (Aug 25 2021 at 20:22):

Mike Shulman said:

Also, I wonder whether you are aware of any work explicitly relating ES to NbE?

No, I am not. If you do find anything, please let me know. Maybe it'll be the time I finally understand NbE! :-)

view this post on Zulip Damiano Mazza (Aug 25 2021 at 20:51):

Mike Shulman said:

It's intriguing, then, that there are two different applications of linear logic to lambda-calculus reduction.

Well, the situation is something like this:

$$\mathrm{LambdaCalculus} \longrightarrow \mathrm{LSC} \longrightarrow \mathrm{ProofNets} \dashrightarrow \mathrm{SharingGraphs}$$

where each item is a category whose objects are terms (or nets/graphs) and whose arrows are reduction sequences (so they are all free categories). The solid arrows are functors, while the dashed arrow is not; it's just a function from objects to objects, but with some kind of simulation property, modulo that "decompression" I was talking about previously.

In fact, if you equip $\mathrm{LSC}$ with non-identity isomorphisms corresponding to structural equivalence (along the lines of our ICFP 2014 paper), and if you are careful with how you formulate proof nets (for one thing, they should be intuitionistic), then $\mathrm{LSC}\rightarrow\mathrm{ProofNets}$ is an equivalence (it should be possible to do things so that $\mathrm{ProofNets}$ is the skeleton of $\mathrm{LSC}$, that's the whole idea of proof nets).

What I was suggesting is that you could probably use "rewriting at a distance" to replace sharing graphs with an equivalent term calculus $\mathrm{SharingTerms}$, and you would get

$$\mathrm{LambdaCalculus} \longrightarrow \mathrm{LSC} \dashrightarrow \mathrm{SharingTerms}$$

So the two applications you are talking about (if I understand correctly) result from the existence of the first functor, for one, and the composition of that functor with the optimal reduction map, for the other.

view this post on Zulip Damiano Mazza (Aug 25 2021 at 21:06):

Of course the functor $\mathrm{LambdaCalculus}\to\mathrm{LSC}$ is very special. It maps every object to "itself", because each $\lambda$-term is in particular a term of the LSC. It is trivially faithful but it is certainly not full as I stated things, because there's lots of reduction sequences in LSC which differ just by the order in which ES rules substituting on different occurrences of the same variable are applied, and these should all be equated. The functor just picks one particular order. It would be interesting to see if, once the category $\mathrm{LSC}$ is properly formulated, the functor actually becomes fully faithful (intuitively, it seems like it should be).

view this post on Zulip Damiano Mazza (Aug 25 2021 at 21:13):

Unfortunately, though, the good properties of the LSC do not seem to be expressible in terms of the functor $\mathrm{LambdaCalculus}\to\mathrm{LSC}$. For example, it is unclear how one would state preservation of strong normalization (some kind of op-lifting property?). So maybe this is more of a way of visualizing the situation than anything of actual technical value.

view this post on Zulip Mike Shulman (Aug 25 2021 at 21:15):

@Cody Roux what I originally said was:

Another odd thought I had today is that explicit substitutions seem a little bit related to normalization by evaluation. If we think of the "evaluation in an environment" step of NbE as pushing a multiple explicit substitution through a term, and give rules saying that it gets stuck when it hits an abstraction and doesn't go further in until the abstraction is applied, at which point it gets augmented by the term being substituted (which could be a variable at readback time).

view this post on Zulip Mike Shulman (Aug 25 2021 at 21:18):

This connection seems even stronger with LSC, where the original ES rule for pushing substitutions inside abstractions, $(\lambda x. t)[y \mapsto u] \leadsto \lambda x.\, t[y\mapsto u]$, is replaced by the rule for "$\beta$-reduction at a distance" $(\lambda x.t)[\vec{y}\mapsto \vec{u}]\, v \leadsto t[x\mapsto v, \vec{y}\mapsto \vec{u}]$.

view this post on Zulip Mike Shulman (Aug 25 2021 at 21:20):

That is, a substituted lambda $(\lambda x.t)[\vec{y}\mapsto \vec{u}]$, which if I understand correctly doesn't reduce further in LSC until it is applied, seems closely analogous to the "defunctionalized closures" that appear as the result of evaluating a lambda in some environment in NbE.
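Relating this to the illustrative NbE sketch earlier in this thread (which is not from any of the papers under discussion): the substituted lambda $(\lambda x.t)[\vec{y}\mapsto\vec{u}]$ plays the role of the value `Closure env x t` with `env` holding $[\vec{y}\mapsto\vec{u}]$, and the distance rule $(\lambda x.t)[\vec{y}\mapsto\vec{u}]\,v \leadsto t[x\mapsto v,\vec{y}\mapsto\vec{u}]$ is the analogue of the `apply` step, which extends the environment with the argument before continuing evaluation.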

view this post on Zulip Jacques Carette (Aug 25 2021 at 21:47):

How much of the LSC has been formalized in a proof assistant? I'd be curious to see how much pain the various things that are hand-waved away in the usual paper presentations cause.

view this post on Zulip Damiano Mazza (Aug 26 2021 at 05:32):

Jacques Carette said:

How much of the LSC has been formalized in a proof assistant? I'd be curious to see how much pain the various things that are hand-waved away in the usual paper presentations cause.

I'm going to ask Beniamino about this, I honestly don't know.

My knowledge of proof assistants is very limited, but one time I did team up with a Coq expert in my lab (Micaela Mayero, who works on formalizing real numbers, floating point arithmetic, correctness of programs for numerical analysis, etc., so very far from $\lambda$-calculus and rewriting) and supervised an undergrad project in which a couple of students formalized (in Coq) the proof of the so-called IE property (implicit-explicit) of the LSC, something that implies preservation of strong normalization via a general, abstract argument due to Delia Kesner (Valeria briefly mentioned this in her talk). On paper, the proof is a couple of pages long, including the preliminary definitions. I'm not sure it was much harder than formalizing an average rewriting result about the pure $\lambda$-calculus (Micaela certainly wasn't shocked by the amount of work it took).

I remember that some inductions, intuitively trivial on paper, did become a bit tricky to formalize because of the non-locality of reduction rules in the LSC (e.g. the fact that a redex in the LSC has the form $(\lambda x.M)[-]N$ where $[-]$ is an arbitrarily long sequence of explicit substitutions). But I don't think it would be anything scary for someone used to formalizing programming languages.

view this post on Zulip Damiano Mazza (Aug 26 2021 at 05:39):

Damiano Mazza said:

It would be interesting to see if, once the category $\mathrm{LSC}$ is properly formulated, the functor actually becomes fully faithful (intuitively, it seems like it should be).

I said something very silly here, sorry. The functor $\mathrm{LambdaCalculus}\to\mathrm{LSC}$ cannot be full, because of the presence of reduction sequences in the LSC that are not mere simulations of $\beta$-reductions, of course. For example, in the LSC we can do call-by-need, which does not correspond to any reduction strategy in the $\lambda$-calculus.

view this post on Zulip Damiano Mazza (Aug 26 2021 at 06:32):

Mike Shulman said:

Also, I wonder whether you are aware of any work explicitly relating ES to NbE?

It is well-known that ES are related to abstract machines, so I looked for a relationship between abstract machines and NbE. I found this paper. In Sect. 2.1, they start from a call-by-name "evaluator" and, via successive steps (closure conversion, CPS, defunctionalization), derive the Krivine abstract machine from it. In Sect. 2.2, they start from a call-by-value machine (the CEK), apply (the inverse of) those steps in reverse order, and obtain an "evaluator" for call-by-value. In the rest of the paper, they do the same for other machines.
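To fix ideas about what such a machine looks like, here is a minimal Krivine-machine sketch in Haskell; the names and details are mine and not taken from the papers linked above. The environment is a delayed, explicit substitution, and the machine performs weak head call-by-name reduction.

```haskell
data Term = Var String | Lam String Term | App Term Term
  deriving Show

-- A closure packages a term with the environment it should run in;
-- environments map variables to closures, playing the role of an
-- explicit, delayed substitution.
data Closure = Closure Term Env
  deriving Show

type Env   = [(String, Closure)]
type Stack = [Closure]

-- One machine step; Nothing means the machine is stuck,
-- i.e. a weak head normal form has been reached.
step :: (Closure, Stack) -> Maybe (Closure, Stack)
step (Closure (App t u) env, stack)     = Just (Closure t env, Closure u env : stack)
step (Closure (Lam x t) env, c : stack) = Just (Closure t ((x, c) : env), stack)
step (Closure (Var x) env, stack)       = fmap (\c -> (c, stack)) (lookup x env)
step _                                  = Nothing

-- Iterate the machine until it is stuck.
run :: Term -> (Closure, Stack)
run t = go (Closure t [], [])
  where
    go s = maybe s go (step s)
```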

Now, I am not familiar enough with NbE to be sure that the "evaluators" they talk about are actually implementing NbE. It looks like they do, as this other paper seems to confirm: it cites the first paper and explicitly claims that the transformations of that paper extract "NbE functions" from abstract machines.

So NbE and ES may actually be related after all, by means of this two-way transformation and the correspondence of ES with abstract machines. But maybe I'm wrong.

view this post on Zulip Mike Shulman (Aug 27 2021 at 19:14):

Thanks! More good reading.

view this post on Zulip Mike Shulman (Aug 27 2021 at 19:35):

@Damiano Mazza, I've taken the liberty of adding some of your remarks to the nLab page on explicit substitutions, so that they won't be lost in this conversation history. Please edit further if you like!

view this post on Zulip Mike Shulman (Aug 27 2021 at 20:42):

Damiano Mazza said:

Now, I am not familiar enough with NbE to be sure that the "evaluators" they talk about are actually implementing NbE. It looks like they do, as this other paper seems to confirm: it cites the first paper and explicitly claims that the transformations of that paper extract "NbE functions" from abstract machines.

On a quick glance, this looks right to me. It seems the significant difference here is that a mere "evaluator", which the first paper shows to correspond to the KAM, implements weak (head) reduction (not going under lambdas). Whereas "normalization by evaluation" is a method for strong reduction (including going under lambdas), and this second paper shows it to correspond similarly to an abstract machine "KN" for strong reduction.

view this post on Zulip Mike Shulman (Aug 27 2021 at 20:44):

I'm surprised at how many of these papers about abstract machines use de Bruijn indices (or levels) literally. It makes them very hard for me to read. I was pleased that your ICFP paper "Distilling abstract machines" not only used named variables but was explicit about how $\alpha$-equivalence is implemented in the machines. Do you know of any papers describing an abstract machine for strong reduction that use named variables?

view this post on Zulip Valeria de Paiva (Aug 27 2021 at 20:59):

well, we also have a paper, "On Explicit Substitutions and Names", presented at ICALP'97, which is about having both names and de Bruijn variables, both typed and untyped lambda-calculus, and the calculus for type theory vs the calculus for implementations: https://www.researchgate.net/publication/2505679_On_Explicit_Substitutions_and_Names

view this post on Zulip Valeria de Paiva (Aug 27 2021 at 23:04):

@Damiano Mazza said:

With 30 years of hindsight, I think that trying to "fix" explicit substitution (ES) systems so that they preserve SN was not a "wrong goal", because it ended up pushing the development of better behaved ES calculi in all respects, not just SN.

well, I do think it was the wrong goal, because as you said my direction is different: all these so-called 'better behaved' ES calculi have no semantics, so they are not better behaved in my books! Syntax is an unruly beast; clever people will do all sorts of weird things with it. We use categorical semantics to make sure that we're not creating monsters.

view this post on Zulip Valeria de Paiva (Aug 27 2021 at 23:40):

@Damiano Mazza said

Cut-elimination in proof nets has the amazing property (from the rewriting viewpoint) that it does not need commuting conversions.

indeed, this is one reason why Girard created proof nets. But I do not agree with

Explicit substitutions should behave like boxes in proof nets:

because I do not think you need to use a specific proof system (proof nets) to reason about syntax in general. You should be able to stick to lambda-calculus, if you prefer. Or sequent calculus or Natural Deduction or deep inference; whatever floats your boat really. in my case, category theory is the preferred semantics. and so I'm very happy to hear that

I am "some of us"! :-)

Yay! Thanks for the several suggestions of avenues to pursue!

view this post on Zulip Damiano Mazza (Aug 28 2021 at 07:57):

Mike Shulman said:

Do you know of any papers describing an abstract machine for strong reduction that use named variables?

There's a whole line of work by Beniamino Accattoli and coauthors on machines for strong reduction, none of which uses de Bruijn indices (he's a rather fierce opponent of them :-) ). The starting point was our paper on an abstract machine for leftmost-outermost reduction (which may be seen as a global-environment version of Crégut's "KN" machine mentioned above), and then there are machines for strong useful reduction and strong call by value. The development of a machine for strong call by need, which, morally, would be an abstraction of what Coq does, is still underway (as far as I know). There's also a nice paper comparing "global environment" to "local environment" machines, for weak as well as for strong evaluation. The upshot is that the latter, although a bit more complex to formulate, are more efficient for weak reduction, but the advantage disappears for strong reduction, where they are asymptotically equivalent.

view this post on Zulip Damiano Mazza (Aug 28 2021 at 08:03):

All of these machines "distill" to a strong reduction strategy of the LSC, in a way which asymptotically preserves complexity, so, for many purposes, studying them or the corresponding LSC strategy is essentially the same thing.

view this post on Zulip Tom Hirschowitz (Aug 30 2021 at 10:10):

Thanks all for the nice discussion and to @Damiano Mazza for pinging me!

view this post on Zulip Damiano Mazza (Aug 30 2021 at 11:43):

Thanks @Tom Hirschowitz! I didn't know that invited paper by Beniamino, it really is a great entry point to the LSC (certainly much better than our ICFP 2014 paper, which, in comparison, is way too technical and biased towards abstract machines).

view this post on Zulip Valeria de Paiva (Aug 30 2021 at 15:07):

Thanks @Tom Hirschowitz for the extra information!

It's good to see some initial semantics for a variant of explicit substitutions in Dec 2019!
But no mention of the old work and no ability to deal with dependent type theory--if I'm reading it right.
I only looked at Ambroise Lafont's thesis slides in https://raw.githubusercontent.com/amblafont/these-slides/master/slides.pdf so far. Meanwhile our approach deals with Calculus of Constructions, modal type theory, linear type and lambda sigma.

view this post on Zulip Mike Shulman (Aug 30 2021 at 15:28):

If I can go back to the original subject of your talk, @Valeria de Paiva, and ask a dumb question: in what sense are the indexed-category models more closely related to explicit substitutions than other models such as CCCs? In both cases there is an operation in the categorical semantics, call it cartesian restriction or composition, that corresponds to substitution in the syntax, and I don't understand how either form of semantics can "see" whether the substitution is explicit or implicit.

view this post on Zulip Valeria de Paiva (Aug 30 2021 at 16:15):

not a dumb question at all. and I'm getting out of my comfort zone here, but indeed you do have mechanisms in the indexed category model to track composition and application of substitutions. these are different operations, denoted by ";" (for composition of substitutions) and "*" (for application of a substitution to a term), so the syntax (and the abstract machine) can see whether a substitution is indeed explicit or not.

the point is always to make the correspondence between the syntax and cat semantics the most explanatory that we can.
I thought the bit in the paper about intensional notions of substitution would appeal to you, as we say

The situation would be analogous to the intensionality of function spaces: In the same way as two functions are not intensionally equal if they produce the same result when applied to the same arguments, two substitutions are not intensionally equal if applied to the same variable they produce the same result.

view this post on Zulip Mike Shulman (Aug 30 2021 at 17:14):

I don't see how separating the operations of composition and substitution has anything to do with whether those operations are explicit or implicit.

view this post on Zulip Mike Shulman (Aug 30 2021 at 17:14):

Put differently, in an explicit substitution calculus the terms $M[x\mapsto N]$ and $M\langle x\mapsto N\rangle$ are distinct. Are there corresponding distinct objects in the indexed-category semantics?

view this post on Zulip Mike Shulman (Aug 30 2021 at 17:15):

Put differently again, ordinary categorical semantics is already quotiented by $\beta$-reduction. An ES calculus augments the terms with extra constructors, but also augments the $\beta$-reduction with new rewrites that get rid of those extra constructors. So how can its semantics be any different?

view this post on Zulip Valeria de Paiva (Aug 30 2021 at 18:15):

$M[x\mapsto N]$ and $M\langle x\mapsto N\rangle$ are distinct.
Are you writing $M[x\mapsto N]$ for the implicit substitution? if so, it's not part of the calculus, right? hence it's not in the model.

composition of explicit subs happens in the base category, while application of a subst to a term happens in the fibres, so two different cats.
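A toy rendering of these two operations in Haskell, with all names hypothetical and capture issues glossed over: substitutions act on terms (the "*" above, here `applySubst`) and compose with one another (the ";", here `compose`), with composition defined so that applying `compose s1 s2` agrees with applying `s1` and then `s2`, up to the usual capture caveats.

```haskell
data Term = Var String | Lam String Term | App Term Term
  deriving Show

-- A substitution is a finite map from variables to terms,
-- i.e. an arrow between contexts in the base category.
type Subst = [(String, Term)]

-- "*": applying a substitution to a term (the action on the fibres).
-- Capture-avoidance is ignored to keep the sketch short.
applySubst :: Subst -> Term -> Term
applySubst s (Var x)   = maybe (Var x) id (lookup x s)
applySubst s (Lam x t) = Lam x (applySubst (filter ((/= x) . fst) s) t)
applySubst s (App t u) = App (applySubst s t) (applySubst s u)

-- ";": composition of substitutions (composition in the base category):
-- push s2 into the images of s1, then keep s2's remaining bindings.
compose :: Subst -> Subst -> Subst
compose s1 s2 =
  [ (x, applySubst s2 t) | (x, t) <- s1 ]
    ++ [ (x, t) | (x, t) <- s2, x `notElem` map fst s1 ]
```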

view this post on Zulip Damiano Mazza (Aug 30 2021 at 20:50):

Let me put the question in different terms. If I understand correctly (and by that I mean both Mike's question and Valeria's talk, so I could be doubly wrong :-) ), if $M$ is a closed normal form, then the interpretations of both $x\langle x\mapsto M\rangle$ and $M$ itself are arrows in the fiber above (the interpretation of) the empty context. Now, since these two terms rewrite one into the other, these interpretations must actually coincide. Right?

view this post on Zulip Valeria de Paiva (Aug 30 2021 at 21:00):

Right! in the empty context you only do what you can do in CCCs. There, on the fibre over 1, you can recover all your traditional CCC modelling, you don't lose anything that you had before. but you do have all your other fibres, right?

view this post on Zulip Damiano Mazza (Aug 30 2021 at 21:11):

If that's the case, then I think I understand a bit better what's going on in this categorical semantics. The difference with respect to an ordinary categorical model of the $\lambda$-calculus is that we are explicitly interpreting substitutions: they are arrows of a base category, whose objects are contexts. Composition in the base category, like Valeria said, is composition of substitutions.

The arrows in the fiber above a context are terms with free variables corresponding to that context. A substitution $\sigma:\Gamma\to\Delta$ induces a map from terms $M$ with free variables in $\Delta$ to terms $M\sigma$ with free variables in $\Gamma$, where $M\sigma$ is $M$ to which the substitution $\sigma$ has actually been applied, not the term $M\langle\sigma\rangle$ containing an explicit substitution. Well, I mean, it is also that term, because they have the same interpretation, but I hope you see what I want to say. For example, the substitution $\langle x\mapsto M\rangle$ maps $x$ to $M$.

Now, if I am not mistaken, this means that while this semantics certainly gives categorical status to the operation of substitution, it is still the ordinary, "implicit" notion of substitution that we are talking about here. Of course, as Mike was pointing out, it is difficult to imagine something different because calculi with explicit substitutions have exactly the same normal forms as calculi without them, so their categorical semantics cannot be different. Going back to what we were saying before about categories and rewriting, I think that only an explicit treatment of reductions (perhaps as higher arrows) can show the difference between something like $M$ and $x\langle x\mapsto M\rangle$.

view this post on Zulip Damiano Mazza (Aug 30 2021 at 21:15):

In particular, although it is of course interesting to have pinned down what substitution is categorically, I do not see how this treatment may be helpful in designing calculi with explicit substitutions, in the sense of guiding us towards the "right" reduction rules, because these are invisible in the semantics. Or am I missing something?

view this post on Zulip Mike Shulman (Aug 30 2021 at 21:58):

Yes, @Damiano Mazza, that's exactly what I was saying. Thanks for putting it more explicitly.

view this post on Zulip Valeria de Paiva (Aug 30 2021 at 23:03):

I think you're missing, @Damiano Mazza, that you can also pinpoint in the semantics where there are syntax design choices that you should consider and where they don't make sense. This is what I meant by category theory as a debugging tool for syntax. and sure, having a theory of reductions that can be higher-order would be great, but the level at which we have them is already a compromise!

view this post on Zulip Tom Hirschowitz (Aug 31 2021 at 06:16):

Valeria de Paiva said:

But no mention of the old work[...]. [N]o ability to deal with dependent type theory--if I'm reading it right.
[...] Meanwhile our approach deals with Calculus of Constructions, modal type theory, linear type and lambda sigma.

Well, that's a bit tough: Section 1.1 is devoted to related work. Maybe it was aggressive to say I was correcting your claim, not at all my intention, sorry about that if it's the case!

The two lines of work are indeed only loosely connected, for two main reasons.

That said, it is certainly desirable to find notions of signatures covering both wild programming languages and logical calculi, including type dependency and linearity! E.g., it would be nice to be able to specify the ML language with a signature, including polymorphic types and the module language. I don't know how to do this for now. Maybe @Daniel Gratzer and @Jonathan Sterling do?

view this post on Zulip Damiano Mazza (Aug 31 2021 at 06:51):

Ok, thanks everyone for helping me understand! At this point, I think it's worth observing that the indexed category semantics is not particularly tied to the original $\lambda\sigma$-calculus approach to explicit substitutions (ES). It applies just as well (or so it seems to me) to any ES calculus which is consistent with equivalences such as

$$M\langle x\mapsto N\rangle\langle y\mapsto P\rangle \equiv M\langle x\mapsto N\langle y\mapsto P\rangle\rangle \qquad \text{if } y \text{ is not free in } M$$
$$M\langle x\mapsto N\rangle\langle y\mapsto P\rangle \equiv M\langle y\mapsto P\rangle\langle x\mapsto N\rangle \qquad \text{if } y \text{ is not free in } N$$

(I haven't checked exactly which equivalences are needed, but the above certainly are, because they result from the assumption that substitutions form a finite product category).

By "consistent" I do not mean that the ES calculus must implement those equivalences as rewriting. For example, the LSC does not, but it is compatible with them (they are included in so-called "structural congruence"). So I have to correct my previous statement that the LSC has no categorical semantics: it does, at least in the sense of indexed categories! :-)

view this post on Zulip Damiano Mazza (Aug 31 2021 at 07:15):

Going back to references and formalization, I asked Beniamino and he confirms that:

view this post on Zulip Jon Sterling (Aug 31 2021 at 12:53):

@Tom Hirschowitz In my experience the distinction between PLs (that allow reduction only in some places) and type theories (that allow reduction everywhere) is more a methodological difference than a technical one. There is a somewhat reliable way to turn a “PL” in that sense into a congruent rewriting theory (and hence an equational theory): reformulate the PL as some machine calculus for which reductions are better understood. The implicit evaluation order of the PL becomes explicit… This is what I understand to be the “French style”, currently embodied in the work of Guillaume M-M and others. The limitations of this approach really only seem to arise when considering true nondeterminism and concurrency, but I am reluctant to see these as rewrites/equations anyway.

For this reason, I have at least in my own work been quite comfortable abandoning evaluation order / operational semantics until I absolutely need it.

view this post on Zulip Jon Sterling (Aug 31 2021 at 12:55):

And of course, Levy’s call-by-push-value is the beginning of this approach for me… I have been using that a lot lately to demolish evaluation order.

view this post on Zulip Valeria de Paiva (Aug 31 2021 at 15:53):

hi,

Well, that's a bit tough: Section 1.1 is devoted to related work

sometimes it's difficult to write what we mean. I meant 'no mention of our old work', not non-mention of any related work. Since I thought ours was the only model of explicit substitutions so far, it would make sense to google "(categorical) model of explicit substitution" and see what came up. But indeed, easier said than done, I'm totally guilty of not checking too.

And no worries about correcting my claim: @Ambroise's work is certainly a counterexample to what I thought. While I do agree that

The two lines of work are indeed only loosely connected

@Ambroise's work is not what I think of, when I say "categorical semantics".

And in all cases, you use the same kind of categorical structures, which gives unity to (this part of) your work. Is this more or less right?

exactly!

E.g., it would be nice to be able to specify the ML language with a signature, including polymorphic types and the module language. I don't know how to do this for now.

By ML you do mean Standard ML the Milner, Gordon, et al programming language https://en.wikipedia.org/wiki/ML_(programming_language), correct?
I believe that Ritter did System F in his master's thesis "eine kategorielle maschine fur den polymorphen lambda kalkul" (but I don't read German). also ML polymorphism is not the same as System F polymorphism, but still, it's a start.

view this post on Zulip Valeria de Paiva (Aug 31 2021 at 15:58):

@Damiano Mazza said

(I haven't checked exactly which equivalences are needed, but the above certainly are, because they result from the assumption that substitutions form a finite product category).

well, they form a finite product category in the case of the simply typed lambda-calculus. if you want to do linear lambda-calculus you need a tensor, not products, i.e. linear-handling-categories.

view this post on Zulip Jon Sterling (Aug 31 2021 at 16:16):

btw, as Mike pointed out, every categorical model is a model of explicit substitution. So it is a little hard to know what it means for something to be the ‘only’ or the ‘first’. But maybe I am confused in the same way some other commenters were.

view this post on Zulip Tom Hirschowitz (Aug 31 2021 at 16:38):

Valeria de Paiva said:

@Ambroise's work is not what I think of, when I say "categorical semantics".

I agree with this.

And on the other hand, your model is not what I think of when I say "categorical operational semantics". The thing generated by @Ambroise's signature is something like a graph internal to some category of modules. (Hence, to comment on @Jonathan Sterling's later post, the signatures and graphs with and without explicit substitutions are definitely distinct!)

So in a sense you were right to ignore @Ambroise's work, @Ambroise was right not to cite yours, and I was the only one wrong, as often, to talk before thinking hard enough.

By ML you do mean Standard ML the Milner, Gordon, et al programming language https://en.wikipedia.org/wiki/ML_(programming_language), correct?
I believe that Ritter did System F in his master's thesis "eine kategorielle maschine fur den polymorphen lambda kalkul" (but I don't read German). also ML polymorphism is not the same as System F polymorphism, but still, it's a start.

Yes, I meant that ML. I agree that it's a start, and I also think Don Sannella had something about this question. But we're still quite far from a notion of signature that would be sufficiently expressive to generate ML, right?

view this post on Zulip Tom Hirschowitz (Sep 01 2021 at 05:32):

Can't find Sannella's account... Who was it then?

view this post on Zulip Valeria de Paiva (Sep 02 2021 at 15:25):

Sorry, @Tom Hirschowitz I don't know.