You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
I have run into a question about mixfix parsing where I'm not sure what the answer should be. Suppose we are in a context where there is a binary operation that's denoted by juxtaposition, with no operator symbol in between. In ordinary algebra this could be multiplication ($x\,y$ meaning $x \cdot y$), while in type theory or $\lambda$-calculus it could be function application ($f\,x$). The question is what kinds of operations are allowed to appear internally to the arguments of such an operation without parenthesization.
Certainly infix operators should not be: $x\,y + z$ means $(x\,y) + z$, not $x\,(y + z)$, so $+$ can't appear unparenthesized inside an argument of juxtaposition.
It also seems clear to me that at least sometimes "around-fix" or "outfix" operators are allowed. For instance, $x\,|y|$ means $x \cdot |y|$, so the outfix absolute value operator can appear unparenthesized inside an argument of juxtaposition. (In fact, parentheses themselves can be regarded as an outfix operator, in which case it's absolutely essential that they can appear here without further parentheses.) I haven't yet encountered a case of an outfix operator that I thought clearly shouldn't be allowed here, so I'm tentatively willing to say they always should be.
The question I don't know the answer to is about prefix and postfix operators. In some cases it seems they should be allowed. For instance, I think most people would agree that $x \int f(y)\,dy$ means $x \cdot \int f(y)\,dy$, so it seems that the prefix integral operator can appear unparenthesized in the right-hand argument of juxtaposition. And I think probably $n!\,m$ means $(n!)\,m$, so the postfix factorial operator can appear unparenthesized in the left-hand argument of juxtaposition.
On the other hand, I think most of us would interpret $-x\,y$ as $-(x\,y)$, even though in this case it happens to equal $(-x)\,y$, so the prefix negation operator isn't allowed inside the left-hand argument of juxtaposition. And I don't know whether $x\,y!$ would mean $x\,(y!)$ or $(x\,y)!$, so probably it is a confusing thing to write at all, but I think $(x\,y)!$ is a defensible interpretation.
These last two paragraphs suggest a general rule that prefix operators are allowed inside the right-hand argument of juxtaposition but not the left-hand argument, while postfix operators are allowed inside the left-hand argument of juxtaposition but not the right-hand argument. This is nice and principled, and goes well in hand (especially for an implementation) with the rule for outfix operators, since at the moment when you are in between the two arguments of juxtaposition, a postfix operator on the left "looks the same as" the ending of an outfix operator on the left, and similarly a prefix operator on the right "looks the same as" the beginning of an outfix operator on the right.
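The rule just proposed can be made concrete with a minimal sketch in Python (purely illustrative; the operator tables are invented, and a real mixfix parser would also track precedence on the operators' open sides):

```python
# Hypothetical sketch of the rule above: within the arguments of
# juxtaposition, a prefix operator may open the right-hand factor and a
# postfix operator may close the left-hand factor, with no parentheses.

PREFIX = {"∫"}    # prefix operators (invented table)
POSTFIX = {"!"}   # postfix operators (invented table)

def parse_juxt(tokens):
    """Split a flat token list into juxtaposed factors, attaching each
    postfix operator to the factor on its left and each prefix operator
    to the factor on its right."""
    factors, i = [], 0
    while i < len(tokens):
        pre = []
        while i < len(tokens) and tokens[i] in PREFIX:
            pre.append(tokens[i])                # prefix opens a factor
            i += 1
        atom = tokens[i]
        i += 1
        while i < len(tokens) and tokens[i] in POSTFIX:
            atom = f"({atom}{tokens[i]})"        # postfix closes a factor
            i += 1
        for p in reversed(pre):
            atom = f"({p}{atom})"
        factors.append(atom)
    return " ".join(factors)
```

So `parse_juxt(["n", "!", "m"])` gives `(n!) m` and `parse_juxt(["x", "∫", "f"])` gives `x (∫f)`, matching the integral and factorial examples.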
But it all breaks down because $x - y$ is not the same as $x\,(-y)$. What's going on? Is it the fact that there also exists an infix operator "$-$" that prevents $x - y$ from being parsed as $x\,(-y)$? Or is there some sense in which "$\int$" and "$!$" have "higher precedence" than juxtaposition while "$-$" has lower precedence? (This would be a different notion of precedence than the usual one -- the latter only applies to the "open" sides of notations, i.e. on the right of a prefix operator, the left of a postfix operator, or both sides of an infix operator -- and it would be strange because there would be only two possible such "precedences", "tighter than juxtaposition" and "looser than juxtaposition".)
Is it the fact that there also exists an infix operator "$-$" that prevents $x - y$ from being parsed as $x\,(-y)$?
Conceptually, I think that is why: there is both a prefix operator and an infix operator with the same symbol, and the infix operator takes precedence.
My perspective on juxtaposition is as you suggest at the end: I view it as just another infix operator, and then apply standard precedence rules, e.g. that multiplication has higher precedence than addition. Then parentheses are necessary only when two operators appear with equal precedence.
I don't follow why this notion of precedence is any different from the usual one, though, save viewing juxtaposition itself as an operation.
(In the specific example of the factorial operator, I would be inclined to give it the same precedence as multiplication, because if I see an expression like $x!\,y$, I would be inclined to read the $!$ as an infix operator, and I wouldn't be sure how to read it as $(x!)\,y$.)
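The view of juxtaposition as just another infix operator, subject to ordinary precedence, can be sketched with a small precedence-climbing parser in Python (purely illustrative; the precedence numbers are made up):

```python
# Juxtaposition treated as an invisible, left-associative infix operator
# "@" with the highest binding power; precedence numbers are invented.
INFIX = {"+": 1, "*": 2, "@": 3}

def parse(tokens, min_bp=0):
    """Parse `tokens` into a fully parenthesized string."""
    lhs = tokens.pop(0)
    while tokens:
        tok = tokens[0]
        op = tok if tok in INFIX else "@"   # adjacency = implicit "@"
        bp = INFIX[op]
        if bp <= min_bp:                    # enforce left associativity
            break
        if op == tok:
            tokens.pop(0)                   # consume the explicit symbol
        rhs = parse(tokens, bp)
        lhs = f"({lhs} {op} {rhs})" if op != "@" else f"({lhs} {rhs})"
    return lhs
```

Here `parse(["a", "b", "+", "c"])` yields `((a b) + c)` and `parse(["a", "+", "b", "c"])` yields `(a + (b c))`, matching the convention that juxtaposition binds tighter than `+`.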
I think the Danielsson-Norell article Parsing mixfix operators provides some good mental models for thinking about these questions. They handle juxtaposition in a reasonable way (they say it denotes function application, not multiplication, but this doesn't really matter). My key takeaway was that not every case of parsing ambiguity comes from precedence. Problems like the ones caused by unary minus vs subtraction show up whenever two operators share name-like parts, and precedence assignment is not the right way to fix them.
By the way, the proof assistant Agda implements something close to the article's proposal. And indeed, Agda could handle the subtraction vs unary-minus multiplication example in a human-like way... modulo the major caveat that Agda does not let you engineer a situation where the following three tests all type-check:
```
infixl 6 _-_

postulate
  - : ℕ → ℕ
  _-_ : ℕ → ℕ → ℕ

test1 : (x y : ℕ) → - x y ≡ (- x) y
test1 x y = refl -- would go through

test2 : (x y : ℕ) → x - y ≡ _-_ x y
test2 x y = refl -- would go through

test3 : (x y : ℕ) → x - y ≡ x (- y)
test3 x y = ? -- would fail
```
You can get quite close by using `--type-in-type` and letting `ℕ` denote the type of the polymorphic identity function. Unfortunately, in that case `(x -) y` also type-checks, so all occurrences of `x - y` become ambiguous.
Nathanael Arkor said:
because if I see an expression like $x!\,y$, I would be inclined to read the $!$ as an infix operator
Do you have this inclination when looking at the denominator of the common $n$-choose-$k$ formula $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$? Or does that feel different?
In this case, $!$ is used as a postfix operator in two locations in the term, so it is intuitive to read the third occurrence also as postfix. Though I don't think I would be comfortable having that kind of reasoning automated in a proof assistant.
Nathanael Arkor said:
I don't follow why this notion of precedence is any different from the usual one, though, save viewing juxtaposition itself as an operation.
I do view juxtaposition as an operation, generally with infinite precedence. But the ordinary precedence of a prefix operator governs what can appear after it. So for instance, unary minus binds tighter than addition so that $-x + y$ means $(-x) + y$, and binds looser than juxtaposition so that $-x\,y$ means $-(x\,y)$. Ordinary precedence doesn't (in my view) govern what comes before a prefix operator.
The "purpose" of precedence is to disambiguate expressions like $x + y \cdot z$. But that only happens at the "boundary" between two notations that are both "open" on the inside, like infix-infix or prefix-infix or prefix-postfix or infix-postfix. For a combination like infix-prefix, for instance $x + -y$, there is no possible ambiguity, so this can be allowed regardless of the precedence of $+$ and $-$.
...except when "$+$" is juxtaposition, written without a symbol, and "$-$" could also be an infix operator. So something different is going on here.
I learned a lot from Danielsson-Norell, and my first try at a parser was implementing their algorithm. But it was too slow and I wasn't enough of a coding wizard to speed it up, so I switched to a Pratt-style parser that has almost no backtracking and it got much faster. (Interestingly, the Pratt-style approach seems to require precedence to be a total ordering.)
But thanks for reminding me of the DN paper, because I switched away from it so long ago that I'd forgotten that they do also take account of precedence when joining operators that aren't both open on the inside. So for them, $x \wedge \neg y$ would only parse if $\neg$ binds tighter than $\wedge$. Now that I'm used to the other way, that seems weird to me; e.g. I would expect to be able to give $\wedge$ a tighter precedence than $\neg$ so that $\neg x \wedge y$ would parse as $\neg(x \wedge y)$ and yet still write $x \wedge \neg y$ for $x \wedge (\neg y)$. Maybe I should look at the paper by Aasa that they mention as having this behavior.
I guess the fundamental question I'm asking is "why does $x - y$ not parse as $x\,(-y)$?"
One possible answer is that precedence applies to closed operator combinations, and unary minus does not have tighter precedence than juxtaposition.
Another is that the alternative parse as binary subtraction somehow "overrides" the parse as $x\,(-y)$. @Nathanael Arkor, were you suggesting this as a general principle? What would the general rule be? Infix operators always override prefix ones?
From the examples that come to mind, I think "infix operators always override prefix ones" would capture my intuition for what typically happens.
Though I wouldn't be surprised if there turned out to be exceptions.
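As a sketch of how a parser could implement this rule (hypothetical; not any proof assistant's actual behaviour), a token declared both infix and prefix can be resolved by position, with the infix reading winning whenever an infix operator can grammatically appear:

```python
# "Infix overrides prefix": a token declared as both is read as infix
# whenever an infix operator can grammatically appear, i.e. right after
# a complete expression. The operator tables are invented.
INFIX = {"-", "+"}
PREFIX = {"-"}

def resolve(after_expression, token):
    """Decide how to read `token` given the parser state."""
    if after_expression and token in INFIX:
        return "infix"     # x - y  →  binary subtraction
    if token in PREFIX:
        return "prefix"    # - y    →  unary negation
    return "operand"
```

So `x - y` never parses as the juxtaposition `x (- y)`: by the time `-` is seen, a complete expression `x` precedes it, and the infix reading takes priority.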
Are there examples that come to your mind other than unary minus / binary subtraction? I can't think of any offhand.
I can't think of any! I don't think anyone writes the reciprocal of $x$ as $/x$.
Digressing, when introducing the axioms for abelian groups I always feel a bit uncomfortable defining $x - y$ to be $x + (-y)$, because we're using the same symbol $-$ in two ways, and students at that particular level, trying to learn more formal mathematics, may find that unsettling if they're thoughtful enough.
It's even worse because nowadays I get students who don't really understand the meaning of substitution, so sometimes they do things like plug in $-3$ for $x$ in $x^2$ and get $-3^2 = -9$.
That's at an earlier stage than abstract algebra, I hope. :grimacing:
It's been a few years since I taught abstract algebra, but I do definitely have students like that in my intro to proofs course.
Mike Shulman said:
Are there examples that come to your mind other than unary minus / binary subtraction? I can't think of any offhand.
Now that I think about it again, the examples I had in mind of operators that are both pre/suffix and infix were coming from programming languages, but these tend not to have juxtaposition, so it doesn't introduce ambiguity. A contrived example could be "%" as both a postfix percentage operator and an infix modulo operator, so that $x \,\%\, y$ could be read either as $x$ modulo $y$ or as $(x\%)\,y$.
But it seems clear to me that such situations, rare as they might be, should always be interpreted as the binary operators, because there is no way to disambiguate in favour of the binary operation, whereas it is simple to disambiguate in favour of the unary operation by parenthesising.
I didn't realize there were programming languages that used % as a suffix percentage operator.
But yes, the absence of juxtaposition does make things a lot simpler.
Nathanael Arkor said:
there is no way to disambiguate in favour of the binary operation, whereas it is simple to disambiguate in favour of the unary operation by parenthesising.
That's a very persuasive argument!
Mike Shulman said:
I didn't realize there were programming languages that used % as a suffix percentage operator.
Oh, it was only meant as a hypothetical example. I don't know if there are any languages that have such an operator, but it doesn't seem completely implausible.
Ah, I see. Another hypothetical example that occurred to me is that `~` is sometimes used as a binary relation (i.e. a binary operator that outputs a Prop) and as a prefix notation for logical negation.
Hmm, I suppose the infix multiplication operator `*` is also used as a unary prefix dereference operator in C...
That makes me wonder whether you're thinking about ternary infix operators at all, Mike. I always found it interesting that the ternary operator is practically the only one in standard use.
Yes, the parser I'm working on for Narya, like those of other proof assistants such as Rocq, Agda, and Lean, supports arbitrary mixfix notations that can have arbitrary arity. Notations like that aren't common in simply typed programming languages, but they're ubiquitous in mathematics and commonly used in proof assistants.
So one can ask this question about more general notations. For instance, you could define both a prefix version of `?:` as well as the usual infix one (`?_:_` and `_?_:_` in Agda syntax), and then Nathanael's rule would say that `X ? Y : Z` is interpreted as the infix version because the prefix one could be disambiguated with `X (? Y : Z)`.