Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.

Stream: community: general

Topic: fun: Rem's Union Find Algorithm

Sam Tenka (May 02 2020 at 18:33):

Morgan Rogers said:

So a rule of thumb could be "if you ever find yourself looking something up on Zulip, record it in nLab before using it elsewhere"?

An amusing digression: this sort of rule is at the heart of Rem's beautiful Union Find algorithm.

The set-up of Rem is that on a finite poset $C$ , one is given a functor $F:C\to C$ that enjoys a natural transformation $\eta: F \to \text{Id}_C$ , where $F$ is represented on a computer as a look-up table. One is interested in the terminal $G:C \to C$ among those functors $G$ that obey $FG = G$ . One seeks to compute values of $G:C\to C$ for a bunch of user-inputted objects $x$ in $C$ . The naive way is: iterate $F$ starting from $x$ until convergence; since finite tosets are well-ordered, this halts.

The key insight of Rem is to apply @Morgan Rogers' rule on top of naive search: whenever we find that $G(x) = F^m(x) = y$ , then update the entries $F$ 's table so that $F^n(x) = y$ for all $n \leq m$ . This changes $F$ , but it preserves all of $F$ 's relevant properties including the goal $G$ . The benefit is that this potentially cuts out ``middlemen'' for the next query (on a different $x$ ). In fact, Tarjan showed that the running time, amortized over lots of queries, is big-O of the Inverse Ackermann Function --- nearly constant!!

Gershom (May 02 2020 at 18:42):

A bit of trivia I want to add here is that in compiler tech, the process of "closing up" references is typically known as "zonking". I believe this terminology is due to Simon Marlow, who first named a function that inside GHC and eventually the term made its way into some literature. I asked him once why he named it that and he replied "that's the sound that it makes."

Lê Thành Dũng (Tito) Nguyễn (May 02 2020 at 18:47):

Despite being familiar with Tarjan's Union-Find (both from programming contests and from research work) I couldn't understand anything from Sam's message. First of all, who is Rem? Second, why is there a total order? (Isn't the lookup table supposed to represent a forest, whose descendent relation is a partial order?) What are functors and natural transformations doing here?

sarahzrf (May 02 2020 at 23:16):

Gershom said:

A bit of trivia I want to add here is that in compiler tech, the process of "closing up" references is typically known as "zonking". I believe this terminology is due to Simon Marlow, who first named a function that inside GHC and eventually the term made its way into some literature. I asked him once why he named it that and he replied "that's the sound that it makes."

absolutely love that kind of naming

Sam Tenka (May 03 2020 at 00:05):

@Nguyễn Lê Thành Dũng

Despite being familiar with Tarjan's Union-Find (both from programming contests and from research work) I couldn't understand anything from Sam's message.
Sorry that I was unclear! I guess I presented the above more succinctly than clearly. I thought (perhaps incorrectly) that the concepts of functor and natural map would make it easy to communicate what structures are involved without having to define forests etc.

First of all, who is Rem?
Martin Rem. I think Tarjan first found a tight bound on Union-Find's complexity (without showing it was tight), but my naming follows Dijkstra, who in chapter 23 of his book "A Discipline of Programming", attributes the algorithm to Rem. Dijkstra is not known for being very careful in citing prior work, so maybe Rem is totally off.

Second, why is there a total order? (Isn't the lookup table supposed to represent a forest, whose descendent relation is a partial order?)
Good point! I should have said "partial order"! The total order concept is unduly constraining. Editing...

What are functors and natural transformations doing here?
The functor is just the look-up table, which as you noted represents a special graph.
The natural map just ensures the graph edges go from larger to smaller nodes, so that we are indeed in a forest.

Morgan Rogers (he/him) (May 03 2020 at 11:12):

Sam Tenka (naive student) said:

The key insight of Rem is to apply Morgan Rogers' rule on top of naive search: whenever we find that $G(x) = F^m(x) = y$ , then update the entries $F$ 's table so that $F^n(x) = y$ for all $n \leq m$ . This changes $F$ , but it preserves all of $F$ 's relevant properties including the goal $G$ . The benefit is that this potentially cuts out ``middlemen'' for the next query (on a different $x$ ). In fact, Tarjan showed that the running time, amortized over lots of queries, is big-O of the Inverse Ackermann Function --- nearly constant!!

This algorithm really blows my mind. Thanks for sharing!