You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hello, friends.
I was wondering if there is a formally defined category for programming. I know about Hask, but people say that is not an actual category due to problems such as non-terminating functions. Bartoz claims that this can be (sort of) amended by adding a bottom value that belongs to any type. So, can we claim that is a category where types are objects, and pure referentially transparent function are morphisms? If we add "terminating functions", can we then claim that is a category without this bottom?
There's discussion of the subtleties around this in this blog post. As a short answer, in a comment Andrej says:
There is a perfectly good categorical treatment of Haskell in terms of denotational semantics. It uses (a suitable version of) -CPOs. It explains exactly what's going on. It clears up misunderstanding about products and monads, and all the other stuff, by showing how we need to argue with inequalities instead of equalities, etc. But this is not Hask. There is no Hask and people should not be told there is one. It is not that hard to learn a bit of domain theory.
I don't fully comprehend the answer. I mean, as far as I understand, people try to define the Hask "category" as a way to encompass all Haskell in Category Theory.
I'm actually interested in something more modest.
I'm actually working in Julia. I just want a "subset" of the programming language to form a category.
I mean, can I assume that is a category types are objects and pure referentially transparent terminating functions are morphisms?
Is this enough to claim that is a subcategory of ?
The question is hugely dependent on the language you're working with and on its semantics, as Dylan pointed out via Bauer
At a very superficial level is ok to think of functions in a programming language as encoding functions in . But as soon as you actually look into it this stops making sense very quickly, because of shared contexts, non-termination and a myriad of other corner cases. I expect a language like Julia to have a pretty involved semantics if you want to be fully rigorous about it.
Another very important fact distinguishing actual programming languages from their idealized version in is extensionality. Two functions with the same extensional beh aviour (computing equal values from equal inputs) might still be wildly different programs. In Set, this difference is not even expressible since functions are just input-output pairs so extensional equality is all you have...
Davi Sales Barreira said:
I know about Hask, but people say that is not an actual category due to problems such as non-terminating functions. Bartoz claims that this can be (sort of) amended by adding a bottom value that belongs to any type.
It's not a technical problem from a category theory point of view that programs can be non-terminating. A category can have partial functions as morphisms (or even far more exotic things). The problem pointed out in the blog post linked by Dylan Braithwaite is that:
seq undefined () = undefined
andseq (undefined . id) () = ()
, therefore we do not haveundefined . id = undefined
.
I don't know the operational definition of "undefined"; I wonder if someone could explain why it exhibits this behaviour?
It's more about seq
. seq a b
is supposed to be undefined if a
is, and otherwise return b
. In order to make this possible when a
is of function type, the function undefined . id = \ x -> undefined
must be different from undefined
.
Because seq f b
cannot compute f x
for all possible x
, to see if they are all undefined
.
If seq
didn't exist, then the only thing you could do with a function would be to apply it, so you wouldn't be able to distinguish undefined . id
and undefined
.
Or in other words, you wouldn't be able to distinguish \ x -> f x
and f
.
My take on this is more computer scientists should talk to physicists, or even better economists, who every day grapple with what it means for squishy reality to be a mathematical object. To pick a famous example you can happily say that physical space is , and that works totally fine unless you push your scale too big or too small when the model breaks down. So this isn't a constructive answer, but the correct answer to "what category corresponds to [real life thing]" the answer is always "what do you need it for?". You can come up with models of nontermination using domain theory or game semantics or operational semantics or whatever, and that works fine as long as you're fine to ignore some other gnarly features of Haskell
I agree generally but one difference is that definitely is an actual mathematical object, whereas it's not clear what (if anything) "Hask" (or "Prog") is meant to refer to, in addition to the question of how closely it corresponds to the real world.
Reid Barton said:
If
seq
didn't exist, then the only thing you could do with a function would be to apply it, so you wouldn't be able to distinguishundefined . id
andundefined
.
Why does seq
exist? :upside_down:
Reid Barton said:
I agree generally but one difference is that definitely is an actual mathematical object, whereas it's not clear what (if anything) "Hask" (or "Prog") is meant to refer to, in addition to the question of how closely it corresponds to the real world.
Mostly commonly when people say "Hask" they are referring to . Sometimes they're referring to something more like
Okay. That seems confusing to me, and it's certainly not what I would mean by Hask--because I would just call those categories Set, respectively DCPO.
I suspect that, in practice, people are not referring to anything in particular.
It's clearer in a better-behaved language like Agda, where I think the the types and terms literally do form a category
How can the problem with seq
possibly be resolved by semantics in other categories? Is seq
just ignored? And if so, why can't one do the same thing to directly form a category of Hask types?
The fact is that function composition in Haskell just isn't associative. You can literally write 3 functions such that and are observationally distinguishable
If you want to pretend that Haskell is a category then you have to ignore language features that cause it to fail, just like how if you want to pretend that the universe is a Euclidean space then you have to ignore physical effects that cause that to fail
Jules Hedges said:
The fact is that function composition in Haskell just isn't associative. You can literally write 3 functions such that and are observationally distinguishable
Sounds interesting, what happens?
Morgan Rogers (he/him) said:
Why does
seq
exist? :upside_down:
I'm not sure how detailed an answer you want, but in short its purpose is to control space usage without having to specialize to specific types. (For specific non-function types you can implement seq
, e.g. seq a b = if a == 0 then b else b
for a : Integer
.)
I don't think associativity can fail. But presumably it depends on what you mean by "function" and "observationally distinguishable", and I don't have definitions of those.
For example if the observations include "unsafeCoerce the resulting functions to Ptr
and read the heap data structures from C" then I agree that associativity can fail.
Matteo Capucci (he/him) said:
Another very important fact distinguishing actual programming languages from their idealized version in is extensionality. Two functions with the same extensional beh aviour (computing equal values from equal inputs) might still be wildly different programs. In Set, this difference is not even expressible since functions are just input-output pairs so extensional equality is all you have...
Agree. Julia is too complex and not functional enough to pacakge as such a simple category as Set. My idea is to indeed consider it as a simplification, as @Jules Hedges suggested.
Here is what I wrote:
"As we’ve pointed out, Julia does not enforce functions to be pure or referentially transparent.
Julia also does check if two composing functions match input and output types. Thus, this
is left for us programmers to enforce. If the functions we define follow these theoretical
assumptions (i.e. our functions are pure, referentially transparent, terminate, and only
compose with functions that match domain and codomain), then we can assume that we are
working in the category Prog."
Yet, I think the above might still not actually true.
Here is another take. This one is from Awodey's book.
image.png
image.png
This is the only thing he says on the subject. I was wondering where I could find the more detailed explanation for this.
Reid Barton said:
Because
seq f b
cannot computef x
for all possiblex
, to see if they are allundefined
.
It's just occurred to me that this is a strange interpretation. Why should "undefined" for a function type coincide with the function which everywhere takes the value "undefined"? The latter has an explicit definition, after all.
Well, in any event, they don't coincide in Haskell.
I now realise that this is what you were saying in the line preceding what I quoted. But then why should undefined . id = \ x . undefined
?
Because f . g = \x -> f (g x)
I still don't think that's true. With type annotations your equation is (undefined :: a -> b) . (id :: a -> a) = \(x :: a) . (undefined :: b)
which I don't think is true
It is true
Side comment: Lots of fun stuff that became folkloric about ways to make Haskell go wrong are in Oleg's blog. For example you can get up to shenanigans by throwing an exception from a pure function using error
(which probably ought to be tagged unsafe
but isn't) and catching it using the proper IO exception mechanism: https://okmij.org/ftp/Haskell/#observe-strictness
Davi Sales Barreira said:
Here is another take. This one is from Awodey's book.
image.png
image.pngThis is the only thing he says on the subject. I was wondering where I could find the more detailed explanation for this.
The difficulty here is that Awodey’s example is slightly oversimplified. It isn’t always obvious what it should mean to compose two programs by applying one to the other. For example if your morphisms are genuinely just pieces of programming language syntax and you say is defined as the function , then you don’t strictly satisfy the identity rule for a category. is not exactly equal as a string of characters to .
To make this work you need to expand what it means for two morphisms to be equal. Usually this is done by defining types of equivalence like . This gets you something that would be called a “syntactic category”. But Haskell is such a complicated language that to iron out all of the obstructions like this is a complicated task. And by the time you are done the category you have left might be different to the way that people informally think of .
Dylan Braithwaite said:
Davi Sales Barreira said:
Here is another take. This one is from Awodey's book.
image.png
image.pngThis is the only thing he says on the subject. I was wondering where I could find the more detailed explanation for this.
The difficulty here is that Awodey’s example is slightly oversimplified. It isn’t always obvious what it should mean to compose two programs by applying one to the other. For example if your morphisms are genuinely just pieces of programming language syntax and you say is defined as the function , then you don’t strictly satisfy the identity rule for a category. is not exactly equal as a string of characters to .
To make this work you need to expand what it means for two morphisms to be equal. Usually this is done by defining types of equivalence like . This gets you something that would be called a “syntactic category”. But Haskell is such a complicated language that to iron out all of the obstructions like this is a complicated task. And by the time you are done the category you have left might be different to the way that people informally think of .
Thanks Dylan. I was wondering if there is an encompassing definition somewhere that properly defines this category of programming. I mean, I have no problem in saying that a certain language is not actually a Category due to extra stuff. I actually intend to work with simplifying assumptions such as claimming that two functions that return the same values for the same input are the same.
The other bit of that passage was about how once you have a syntactic category, you can take functors on it as giving ‘interpretations’ of your programs in different categories. Although it is hard to make this formal for Haskell or Julia, I think it would be fairly uncontroversial to say that functions you write in Julia under the list of pretty restrictive assumptions you gave correspond to certain morphisms of
There isn’t a general ‘programming’ category because there are lots ways for programming languages to work, and they might work in contradictory ways. However if you have a presentation of the types and basic term constructions you care about in your language you can construct a [[syntactic category]] over that.
Reid Barton said:
Because
f . g = \x -> f (g x)
For the unary case, a consistent convention would be f = \x . f(x)
, but that isn't universal, as witnessed by undefined
. So does Haskell just fail to be a category because the conventions on functions are inconsistent? For instance, if I instead identified any expression involving undefined :: a -> b
to also be undefined
of the relevant type, would that resolve the issue with seq
? (I'm aware that Haskell already has this conflict built in, so this is just a hypothetical)
I'm not sure I understand the question, but certainly it's an essential aspect of Haskell that fst (3, undefined) = 3
and not undefined
, no matter what type the undefined
is at
By the way, there are various ways to fix the unit law, for example by making the composition more strict, or restricting morphisms to functions that are already lambdas.
Reid Barton said:
I'm not sure I understand the question, but certainly it's an essential aspect of Haskell that
fst (3, undefined) = 3
and notundefined
, no matter what type theundefined
is at
Slightly more subtly, undefined :: (a, b)
and (undefined :: a, undefined :: b)
are different, since they are distinguished by the pattern (_, _)
Reid Barton said:
I'm not sure I understand the question, but certainly it's an essential aspect of Haskell that
fst (3, undefined) = 3
and notundefined
, no matter what type theundefined
is at
But is snd (3,undefined) = undefined
or \x. undefined
?
The former
So snd
is not treated like a function in Haskell... This discussion is giving me a lot of insight into the ways in which Haskell was not designed categorically!
Huh? snd
is a function
And it satisfies snd (x, y) = y
for all x
, y
.
It's just f = \x -> f x
that doesn't hold.
Then why isn't snd (3,undefined) = \x . snd(3(x),undefined(x)) = \x . undefined
? You told me before that this is how composition works
I am very confused
What does .
mean to you?
You're using it to mark the body of the lambda right? That's sort of confusing in Haskell because .
is the function composition operation
Based on what you're writing, I should be writing \x -> f x
snd (3,undefined) = \x . snd(3(x),undefined(x))
3
is probably an Integer
or something and not a function; I don't understand where this equation came from
So the distinction is that even though snd
is a function, snd (f, g)
is not snd . (f, g)
?
Aha
f x
means "f
applied to x
"
Maybe that is the starting point.
Composition is .
.
(f, g)
is an ordered pair, so it wouldn't make sense to compose it with anything anyways.
f x
for function application is pretty standard in ML derivatives, so I forgot it might be unfamiliar.
I think my confusion is coming from not distinguishing terms from values
I would suggest not trying to force things into a categorical viewpoint too much. In particular terms/values are not morphisms from the terminal object or anything, they are just terms.
Reid Barton said:
It's just
f = \x -> f x
that doesn't hold.
I'm not very surprised to learn this, but I can't think of a context to distinguish them, do you know how to do it?
Making the distinction, I could refine what I said previously to taking the convention undefined . f = undefined
instead of defining composition universally via the lambda expression seems to fix the problem?
Yes, you can do this (I think you had better do it for f . undefined
too, to get the other unit law as well) and you can even implement it in Haskell... using seq
!
So to return to the original question, is that the adaptation that allows one to construct categorical semantics for Haskell?
Just for fun (making Haskell go wrong is a sort of hobby), you can also break these equations by using the fact that you have multiple undefined values other than undefined
that you can observationally distinguish
Screenshot-2023-02-23-at-12.42.34-pm.png
"ohnoes" is a high quality name for an error
Doing something like this is necessary to fix associativity... but what's really missing is a formal definition in the first place. It will involve some choices (e.g. do I have exceptions in addition to non-termination, and can I tell them apart), and a lot of writing-things-down work.
Jules Hedges said:
Reid Barton said:
It's just
f = \x -> f x
that doesn't hold.I'm not very surprised to learn this, but I can't think of a context to distinguish them, do you know how to do it?
It's exactly this seq undefined
on functions: seq undefined z = undefined
but seq (\ x -> undefined x) z = z
.
It is known that many typed lambda calculi form categories this way (called syntactic categories). Well, STLC surely does. But then you have to do weird and dangerous things with -equivalence.
there is a huge amount of work on categorical semantics for various type theories (lambda calculi) and it's something I studied extensively in grad school and after - I am happy to answer questions
Jules Hedges said:
My take on this is more computer scientists should talk to physicists, or even better economists, who every day grapple with what it means for squishy reality to be a mathematical object. [...] the correct answer to "what category corresponds to [real life thing]" the answer is always "what do you need it for?".
This is the intention behind the paper Fast and Loose Reasoning is Morally Correct which defines a PER semantics over a standard domain theoretic interpretation of a Haskell-like language that defines when too things are "morally equivalent", meaning equivalent as if we reasoned as if everything was total.
The Internet being what it is, the title unfortunately led to some heated arguments over the whether or not a person's programming style could be considered "immoral" or not.
https://arxiv.org/pdf/2010.05167.pdf
Jules Hedges said:
It's clearer in a better-behaved language like Agda, where I think the the types and terms literally do form a category
Here is an instance of Agda (as in Agda's types and terms) being a category in Jacques Carette's agda-categories library.
Stelios Tsampas said:
Jules Hedges said:
It's clearer in a better-behaved language like Agda, where I think the the types and terms literally do form a category
Here is an instance of Agda (as in Agda's types and terms) being a category in Jacques Carette's agda-categories library.
Here Set
can be made into an internal category, but that doesn't necessarily mean that there is a category of its types and terms, those are two different things
I'm not sure I follow...
Ah, I see. Are you saying that the fact that "Set o" (as it appears in line 16) forms a category (with the expected functions as morphisms, identity morphisms etc.) is not the same as Agda's types and terms generally being a category?
The interpretation of Agda's Set
is the class of all sets. (Otherwise, the interpretation of a Category
wouldn't be a category. Let's ignore that we are working without quotients/function extensionality so that we need these setoids.)
Then, the interpretation of the category Set
is the category of sets. But the category that is being talked about here is one whose objects are actual types (as in syntax) of Agda.
I'm sorry, what do you mean by Set
here? The Agda keyword?
Right
So this is an abbreviation of Set 0
. And it forms a Categories.Category
. I accept Categories.Category
being a suitable notion of the mathematical "category". What is the problem again?
The category that was being discussed here is one whose objects and morphisms are given by syntax somehow. In particular, whatever the details of the construction are, it must have only countably many objects and morphisms.
basically, you're saying that, with Agda as a metatheory, you can prove that Set
is a category, but here the question is about which category could be the semantics for Agda's types and terms
If we take for granted that there is a category of Agda types and terms, then the construction in Jacques' library can be (kind of) seen as describing an [[internal category]] in whose object of objects is Set 0
Josselin Poiret said:
basically, you're saying that, with Agda as a metatheory, you can prove that
Set
is a category, but here the question is about which category could be the semantics for Agda's types and terms
Categories.Category
Sets l
of course!
I guess what I am saying only works if I consider type theory a self-contained foundations for math, hence for category theory, which I absolutely do. I feel no need to interpret anything beyond the confines of type theory/Agda itself.
But I guess the question implies different foundations, which means I'm wrong.
I think there's no problem with that. But when you're comparing Haskell with Agda, and discussing why it's easier to define a syntactic category for Agda than it is for Haskell (as we were above), then you have to work externally to Agda's type theory. Which means that you have to distinguish between a category and a Categories.Category
, and between the syntactic notion of an Agda type and a term X : Set
in Agda.
Stelios Tsampas said:
I guess what I am saying only works if I consider type theory a self-contained foundations for math, hence for category theory, which I absolutely do. I feel no need to interpret anything beyond the confines of type theory/Agda itself.
But then Set
isn't enough internally for programming language semantics, as the OP talked about. That's the main problem with set-based foundations
Josselin Poiret said:
Stelios Tsampas said:
I guess what I am saying only works if I consider type theory a self-contained foundations for math, hence for category theory, which I absolutely do. I feel no need to interpret anything beyond the confines of type theory/Agda itself.
But then
Set
isn't enough internally for programming language semantics, as the OP talked about. That's the main problem with set-based foundations
What is the problem with Set
?
Dylan Braithwaite said:
I think there's no problem with that. But when you're comparing Haskell with Agda, and discussing why it's easier to define a syntactic category for Agda than it is for Haskell (as we were above), then you have to work externally to Agda's type theory. Which means that you have to distinguish between a category and a
Categories.Category
, and between the syntactic notion of an Agda type and a termX : Set
in Agda.
Well, maybe the question can be phrased as "Are there acceptable foundations under which Haskell/Agda form categories", so I'm not 100% convinced that you have to. But I hear you.
Note that while Agda itself can only show countably many things inhabit Set o
for any given level o
(and we never ever talk about 'all' Set
s as that collection does not want to exist), that doesn't imply that most natural models are countable. Incompleteness is rampant in these waters, so who knows what the size of the 'true' things really is; we just know that the size of all the provable ones is indeed countable.