Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: learning: questions

Topic: Well-Typed Programs/Algebraic Structures

Corey Thuro (Mar 18 2024 at 02:28):

I am wondering about the possibility of abstracting program execution in strongly-typed languages into an algebraic structure, with the aim of representing the programs as polynomials; where a well-typed program would be analogous to a polynomial root.

I am relatively new to mathematics and computer science, so apologies if this question is terminologically imprecise or too vague to be meaningful. If it is meaningful, I am wondering if there is research being done into anything like this. If it is too vague to be meaningful, I'm happy to be told so :)

I post this here because I wonder whether the representation could be aided or established via interactions between some category like Set and another whose objects are data types and morphisms are programs.

Peva Blanchard (Mar 18 2024 at 08:26):

Does the origin of your question relate to a possible analogy between "algebraic data types" and "algebraic numbers" (roots of polynomials with rational coefficients)?

Eric M Downes (Mar 18 2024 at 23:48):

Peva's question will probably yield fruitful inquiry via google.

But there is a sensible analogy I can think of, and your intuition is good. I'll do 80% of the existence proof and ask you to complete it.

Say your polynomials are in an "integral domain" $R$ . That is, a ring (has $+$ and $*$ , and $*$ distributes over $+$ ) in which multiplying two-non-zero elements will never give you zero. This is the usual home for polynomials for mathematicians. You can think of it like the reals $\mathbb{R}$ if that helps.

Let a program $\tilde{f}$ correspond to a polynomial $f$ , and a proposed program composition $\tilde{f}\circ\tilde{g}$ will correspond to a product in $R$ of the polys $f*g$ By "proposed" I mean, they may or may not actually compose, but the code as-written asks them to. You can prove that no new zeros will be introduced, nor any old zeros lost. (Though they might double up..)

So this tells us that if we have such a polynomial as you describe, it behaves in an intuitive way wrt to program composition. Now let's construct one such function by appealing to something we already know that compilers do.

Say you have a function $z:\mathrm{Syntax}\to\mathbb{N}$ where $z$ counts the number of type-signatures that do not line up. Do all your code replacement, so any C templates, function decorators, monads, ASTs, blah blah etc. are all applied, but we aren't at machine code yet -- there are still types; the resulting functions being asked to compose must have the correct type signatures. I claim something like $z$ must be called by the compiler to just count the number of type-signature mismatches.

So for example, we have reduced to functions
$\tilde{a}:A\to TB, ~\tilde{b}:B\to C,~ \widetilde{Tb}:BT\to TC; ~TB\neq B$ . Then
$z(\tilde{b}\circ\tilde{a})\neq 0, ~ z(\widetilde{Tb}\circ\tilde{a})=0$ .

Any polynomial $f:\mathbb{R}\to\mathbb{R}$ with a root at zero and no other natural number will indeed give you a root just when composed with $z$ . This is a constructible function by appealing to: compilers do type-checking, and strongly typed languages refuse to compile compositions of mismatched signatures.

The only thing we haven't shown (and our $z$ is useless for) is:
$z(\tilde{f}\circ \tilde{g})=0$ implies $z(\tilde{f})=0$ and $z(\tilde{g})=0$ ; because $0$ is absorbing. This would follow if the lhs $\tilde{f}\circ\tilde{g}$ could only be written when $\tilde{f}\circ\tilde{g}$ was type-consistent. But as I said we're considering "proposed" compositions...

So. Can you think of a way of separating the zeros at compile time?

The actual roots don't need to mean anything, you just need to describe an algorithm for forming a polynomial of degree $k$ with only distinct (non-doubled-up) roots, where $k$ is the number of requested compositions, knowing that you already have the ability via $z$ to count type-mismatches.

Corey Thuro (Mar 19 2024 at 17:16):

Peva Blanchard said:

Does the origin of your question relate to a possible analogy between "algebraic data types" and "algebraic numbers" (roots of polynomials with rational coefficients)?

I know a (very) little bit about algebraic data types and algebraic numbers and I will be sure to look into them more! My question came out of a more basic intuition about the algorithmic quality of polynomials and a similar (maybe misguided) gut association re: polynomial roots and well-typed programs. I associated both of them with something like successfully picking a lock.

Corey Thuro (Mar 19 2024 at 17:23):

Eric M Downes said:

Peva's question will probably yield fruitful inquiry via google.

But there is a sensible analogy I can think of, and your intuition is good. I'll do 80% of the existence proof and ask you to complete it.

Say your polynomials are in an "integral domain" $R$ . That is, a ring (has $+$ and $*$ , and $*$ distributes over $+$ ) in which multiplying two-non-zero elements will never give you zero. This is the usual home for polynomials for mathematicians. You can think of it like the reals $\mathbb{R}$ if that helps.

Let a program $\tilde{f}$ correspond to a polynomial $f$ , and a proposed program composition $\tilde{f}\circ\tilde{g}$ will correspond to a product in $R$ of the polys $f*g$ By "proposed" I mean, they may or may not actually compose, but the code as-written asks them to. You can prove that no new zeros will be introduced, nor any old zeros lost. (Though they might double up..)

So this tells us that if we have such a polynomial as you describe, it behaves in an intuitive way wrt to program composition. Now let's construct one such function by appealing to something we already know that compilers do.

Say you have a function $z:\mathrm{Syntax}\to\mathbb{N}$ where $z$ counts the number of type-signatures that do not line up. Do all your code replacement, so any C templates, function decorators, monads, ASTs, blah blah etc. are all applied, but we aren't at machine code yet -- there are still types; the resulting functions being asked to compose must have the correct type signatures. I claim something like $z$ must be called by the compiler to just count the number of type-signature mismatches.

So for example, we have reduced to functions
$\tilde{a}:A\to TB, ~\tilde{b}:B\to C,~ \widetilde{Tb}:BT\to TC; ~TB\neq B$ . Then
$z(\tilde{b}\circ\tilde{a})\neq 0, ~ z(\widetilde{Tb}\circ\tilde{a})=0$ .

Any polynomial $f:\mathbb{R}\to\mathbb{R}$ with a root at zero and no other natural number will indeed give you a root just when composed with $z$ . This is a constructible function by appealing to: compilers do type-checking, and strongly typed languages refuse to compile compositions of mismatched signatures.

The only thing we haven't shown (and our $z$ is useless for) is:
$z(\tilde{f}\circ \tilde{g})=0$ implies $z(\tilde{f})=0$ and $z(\tilde{g})=0$ ; because $0$ is absorbing. This would follow if the lhs $\tilde{f}\circ\tilde{g}$ could only be written when $\tilde{f}\circ\tilde{g}$ was type-consistent. But as I said we're considering "proposed" compositions...

So. Can you think of a way of separating the zeros at compile time?

The actual roots don't need to mean anything, you just need to describe an algorithm for forming a polynomial of degree $k$ with only distinct (non-doubled-up) roots, where $k$ is the number of requested compositions, knowing that you already have the ability via $z$ to count type-mismatches.

Eric thank you so much for this thoughtful response. I really really appreciate it. I am in the middle of a busy work week so it may take me a few days but I am going to go through everything you said carefully and try to complete the existence proof, and I am looking forward to doing so.

Peva Blanchard (Mar 20 2024 at 00:11):

Corey Thuro said:

Peva Blanchard said:

Does the origin of your question relate to a possible analogy between "algebraic data types" and "algebraic numbers" (roots of polynomials with rational coefficients)?

I know a (very) little bit about algebraic data types and algebraic numbers and I will be sure to look into them more! My question came out of a more basic intuition about the algorithmic quality of polynomials and a similar (maybe misguided) gut association re: polynomial roots and well-typed programs. I associated both of them with something like successfully picking a lock.

Here's what I had in mind.

Let's consider a collection $V = \{x_1, x_2, \dots\}$ of real numbers, and let's think of $V$ as a type. This type should denote a set of programs, i.e., a set of polynomials. It is interesting to choose this set to be the set $I(V)$ of polynomials $P$ with rational coefficients such that $P(x) = 0$ for all $x \in X$ .

Interestingly, when you have a collection $I = \{p_1, p_2, \dots\}$ of programs (polynomials), you can form a type: namely $V(I)$ the collection of numbers $x$ such that $p(x) = 0$ for all $p \in I$ . Then we could say that a polynomial $p$ is well-typed if $V(\{p\})$ is not empty.

Now if you have types $U$ , $V$ , $W$ , which are just sets, you can consider their union $U \cup V$ , intersection $U \cap V$ , and $U \Rightarrow_W V = \{ w \in W | w \in V \textrm{ or } w \not\in U\}$ .

It's quite fun to describe $I(U \cup V)$ , $I(U \cap V)$ , and $I(U \Rightarrow_W V)$ .

It is even more fun to describe $I(V(J))$ for an arbitrary set $J$ of polynomials.

ps: this perspective is a very basic introduction to algebraic geometry. You can start with Hilbert's Nullstellensatz.

Graham Manuell (Mar 20 2024 at 06:00):

I'm surprised no one has mentioned the Seven Trees in One paper yet. Also see this paper for a more general result.

David Corfield (Mar 20 2024 at 08:54):

Graham Manuell said:

this paper

In that vein, see @John Baez's TWF202.

Mike Stay (Mar 21 2024 at 16:47):

@Graham Manuell wrote

Also see this paper for a more general result

Yeah, Fiore and Leinster's Objects of Categories as Complex Numbers (hereafter OCCN) is one of my favorite papers of all time. But I don't really see how it would apply here.

Programming languages are usually presented in two pieces. The first piece is the "syntax". For a language like C, this not only includes the text that a programmer writes, but also things like the representation of the stack and the heap. Each different component of the machine state gets a "sort", and there are "term constructors" for assembling things of each sort into a program. A simple example is the pi calculus, where there are two sorts, one for names and another for processes.

The syntax also usually includes any "structural congruences". These are equations between term constructors that remove unimportant distinctions between terms. For example, in pi calculus, processes form a commutative monoid with the "parallel composition" operator | as multiplication and the "do nothing" process 0 as multiplicative unit. This means imposing associativity, commutativity, and unit laws.

The second piece is the reductions. In lambda calculus, beta reduction is usually an equation, but if you made reduction into an equation in a non-confluent system like pi calculus, the whole structure would collapse. So reductions are usually taken to form the edges of a graph whose vertices are structural congruence classes of terms.

We can combine both of these into an algebraic theory (perhaps a higher-order theory if we assume a built-in mechanism for abstraction instead of implementing it ourselves). Sorts become generating objects; term constructors become generating morphisms; rewrites are an object equipped with "source" and "target" morphisms from the sort of rewrites to the sort of terms; structural congruence is a set of equations between term constructors; and we have equations from the formal expression src(r) and tgt(r) to the actual terms that are the source and target of a rewrite.

The free model of such a theory in Set picks out a graph. The vertices are programs that are merely well-sorted, not well-typed. Typing comes afterwards; you're picking out a subgraph of the reduction graph whose vertices (programs) have certain properties. The subgraph is defined inductively.

I could imagine an OCCN-like construction where you could ask what the complex cardinality is of the set of terms/vertices or the set of rewrites/edges in a subgraph: without equations, sorts behave like complex numbers and term constructors like polynomials. Perhaps there's some way to get equations into the game using groupoid cardinality. But I don't see how to square this with @Corey Thuro's intuition that programs themselves have cardinalities that are roots of polynomials.

Corey Thuro (Mar 26 2024 at 03:34):

Corey Thuro said:

Eric M Downes said:

Peva's question will probably yield fruitful inquiry via google.

But there is a sensible analogy I can think of, and your intuition is good. I'll do 80% of the existence proof and ask you to complete it.

Say your polynomials are in an "integral domain" $R$ . That is, a ring (has $+$ and $*$ , and $*$ distributes over $+$ ) in which multiplying two-non-zero elements will never give you zero. This is the usual home for polynomials for mathematicians. You can think of it like the reals $\mathbb{R}$ if that helps.

Let a program $\tilde{f}$ correspond to a polynomial $f$ , and a proposed program composition $\tilde{f}\circ\tilde{g}$ will correspond to a product in $R$ of the polys $f*g$ By "proposed" I mean, they may or may not actually compose, but the code as-written asks them to. You can prove that no new zeros will be introduced, nor any old zeros lost. (Though they might double up..)

So this tells us that if we have such a polynomial as you describe, it behaves in an intuitive way wrt to program composition. Now let's construct one such function by appealing to something we already know that compilers do.

Say you have a function $z:\mathrm{Syntax}\to\mathbb{N}$ where $z$ counts the number of type-signatures that do not line up. Do all your code replacement, so any C templates, function decorators, monads, ASTs, blah blah etc. are all applied, but we aren't at machine code yet -- there are still types; the resulting functions being asked to compose must have the correct type signatures. I claim something like $z$ must be called by the compiler to just count the number of type-signature mismatches.

So for example, we have reduced to functions
$\tilde{a}:A\to TB, ~\tilde{b}:B\to C,~ \widetilde{Tb}:BT\to TC; ~TB\neq B$ . Then
$z(\tilde{b}\circ\tilde{a})\neq 0, ~ z(\widetilde{Tb}\circ\tilde{a})=0$ .

Any polynomial $f:\mathbb{R}\to\mathbb{R}$ with a root at zero and no other natural number will indeed give you a root just when composed with $z$ . This is a constructible function by appealing to: compilers do type-checking, and strongly typed languages refuse to compile compositions of mismatched signatures.

The only thing we haven't shown (and our $z$ is useless for) is:
$z(\tilde{f}\circ \tilde{g})=0$ implies $z(\tilde{f})=0$ and $z(\tilde{g})=0$ ; because $0$ is absorbing. This would follow if the lhs $\tilde{f}\circ\tilde{g}$ could only be written when $\tilde{f}\circ\tilde{g}$ was type-consistent. But as I said we're considering "proposed" compositions...

So. Can you think of a way of separating the zeros at compile time?

The actual roots don't need to mean anything, you just need to describe an algorithm for forming a polynomial of degree $k$ with only distinct (non-doubled-up) roots, where $k$ is the number of requested compositions, knowing that you already have the ability via $z$ to count type-mismatches.

Eric thank you so much for this thoughtful response. I really really appreciate it. I am in the middle of a busy work week so it may take me a few days but I am going to go through everything you said carefully and try to complete the existence proof, and I am looking forward to doing so.

This is taking a bit longer to get to than I'd hoped. Life is busy! But here is what I came up with so far.

To construct a polynomial $l$ with degree $k$ and distinct roots:

Let $l$ be defined as the product

$a\cdot b \cdot \ldots \cdot k$

of distinct degree-1 polynomials $a, b, ... , k$ with the restriction that for any two polynomial factors of $l$ (say, $a, b$ ) and given polynomials $q, r$

$a = bq + r \implies r \neq 0$

This ensures that each polynomial will have at most one root; and since each polynomial is distinct and no one polynomial is a multiple of another, no two roots will be identical.

Now, if $z(\tilde{l}) = n$ for $0 < n \leqslant k$ , this should correspond to $n$ polynomial factors $a, b, \ldots, n$ of $l$ having values which aren't roots. The count of not-well-typed compositions and non-roots should be the same.

On the other hand, it should be that
$z(\tilde{l}) = 0 \iff z(\tilde{a} \circ \tilde{b} \circ \ldots \circ \tilde{k}) = 0 \iff$ the value of every polynomial factor $a, b, \ldots, k$ was a root.

Please let me know if my reasoning is off. Thanks again, I am learning a lot from this.

Corey Thuro (Mar 26 2024 at 03:38):

David Corfield said:

Graham Manuell said:

this paper

In that vein, see John Baez's TWF202.

Thank you David, I will check this out. Btw, I think Modal Homotopy Type Theory is a compelling book. I hope to spend more time with it.