You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.
I wrote a paper about computing the algebraic path problem on systems which are left open to their surroundings. I'm eager to get feedback on this...I'm new to computer science so I'm hoping that people here will have a perspective on this that I don't.
A bit about the paper: The algebraic path problem is a generalization of the shortest path problem which covers a lot of ground. For any commutative quantale $R$ (i.e. a monoidal closed poset with all joins), the algebraic path problem computes the transitive closure of matrices weighted in $R$. Here is a table showing some of the problems you can solve by choosing different quantales, for example:
- $([0,\infty], \min, +)$: shortest path
- $(\{\mathrm{true},\mathrm{false}\}, \vee, \wedge)$: reachability
- $([0,\infty], \max, \min)$: maximum capacity (widest path)
- $([0,1], \max, \times)$: most reliable path
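To make the first row concrete, here is a minimal Haskell sketch (the names are mine, not from the paper) of the min-plus quantale and its matrix "multiplication", which is the basic step of the shortest path computation:

```haskell
import Data.List (transpose)

-- Weights live in [0, +infinity]; 1/0 stands in for +infinity (no edge).
type Weight = Double

inf :: Weight
inf = 1 / 0

-- In the min-plus quantale, "addition" is min and "multiplication" is +.
wPlus, wTimes :: Weight -> Weight -> Weight
wPlus  = min
wTimes = (+)

-- Matrix "multiplication" in this quantale:
-- (ab)_ij = min over k of (a_ik + b_kj), the best two-leg trip via any k.
mmul :: [[Weight]] -> [[Weight]] -> [[Weight]]
mmul a b =
  [ [ foldr wPlus inf (zipWith wTimes row col) | col <- transpose b ]
  | row <- a ]
```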
To make the algebraic path problem open, you can use the cospan formalisms developed by Baez, Courser, and Fong. More specifically, there is a bicategory where morphisms are open $R$-matrices and composition is gluing of these open $R$-matrices. The transitive closure, and therefore the solution to the algebraic path problem on open $R$-matrices, can be cast as a functor from this bicategory. I was very pleased to get the mapping to be functorial with respect to this gluing. In particular, functoriality gives an isomorphism between the solution of the algebraic path problem on a composite system and the solutions on its components. I hope that this isomorphism can inspire better algorithms for computing the algebraic path problem.
I also have a softer question. I find the conference/publishing culture of computer science to be very confusing. I'm wondering if anyone has a suggestion on a good place to submit a paper like this if my goal is to get a job in academic computer science.
it's cool
Jade -- very nice! have you seen http://stedolan.net/research/semirings.pdf -- how close is a commutative quantale to a "closed semiring"?
On to the soft question. Here is my take, worth a grain of salt. Other people might say "actually papers just like this are published in CS venue X" and they're probably right and I just didn't know about X. But as it stands, it feels like it's not yet pitched sufficiently at a CS audience. I note that one basic thing to do would be to take your formula in (3) and actually write out in pseudocode in your language of choice (or real code in Haskell or something) how to actually perform this procedure. Otherwise people will say "you've shown me a mathematical result, but I don't actually see a procedure anywhere!"
Writing up things for CS is weird, in my observation, because I think most things have to be explained very explicitly, like to an undergraduate, but there are occasional things that if you explain them too explicitly people will feel "talked down" to, and it is perhaps hard to know which is which, until you see what reviewers yell at you about.
One key element is you need to choose if you want your paper to be "Theory A" or "Theory B": https://blog.computationalcomplexity.org/2003/03/theory-and-theory-b.html
One way I think of this is "Theory B" is more about "how can we describe/classify/translate problems we want to solve" (and sometimes "is this solvable at all") and "Theory A" is more about "ok, given a specification of a problem, how efficiently can we solve it".
The two cultures only have modest overlap, and people with CS-ties on a place like this (including myself) will tend towards Theory B. But most (not all!) CS academic jobs in the U.S. have tended towards Theory A in the past. Nowadays, I think there is also just "systems engineering" stuff which I can't really classify. My description here is a bit rude and ignorant, but it seems to me: "here's a real world problem, here's a smaller version, and we just like wrote a program and ran it a bit and it did some stuff, take a look." Sometimes it's "here's the actual real world problem, and working with BIGCO we actually did write some stuff to solve it, in some sense, maybe, and we can't show you the stuff, because it's proprietary, but we're going to make some claims about it anyway."
Anyway, your result points the way towards something that Theory A might appreciate, but it's not there yet. To make it acceptable at a Theory A venue I'd think you'd want some actual claim where you say "if the input stuff has this structure then we can give these improved complexity bounds" that's very crisp, and that would be the "point". For the more systemsy "Theory A" stuff you instead find some real-world dataset and run your algorithm and some competing ones over it, and show "we don't know exactly why the data has this structure, but in practice we did better." For "Theory B" you could instead say "let's take compositionality as fundamental, and now we prove this procedure works on tiny things, and that when you take pushouts of tiny things it carries through", so now we derive a "correct-by-construction" procedure for big things. Then, ideally, you code it up in Agda or something and the proof of the required property of the result is constructed inductively with the result itself, and that's considered of interest.
Other people will I'm sure have other and better ideas as to how one would streamline it into one of those slots.
(lol one place that i thought of where this might fit more as-is is MSCS, but you already know all about it and just published there!)
Gershom said:
Jade -- very nice! have you seen http://stedolan.net/research/semirings.pdf -- how close is a commutative quantale to a "closed semiring"?
Thanks Gershom. I hadn't seen that pdf but it looks great. A (commutative?) quantale can be turned into a closed semiring with join as its $+$. The $*$ operation is given by the free monoid in the preorder of your quantale.
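In Haskell-style pseudocode (a minimal sketch; the class and method names are mine), the translation looks like this, with the closed-semiring $*$ built by iterating joins:

```haskell
-- A commutative quantale: a poset with all joins and a compatible
-- commutative monoidal product. (Only binary/empty joins shown.)
class Quantale q where
  bot    :: q            -- empty join
  join   :: q -> q -> q  -- binary join; this is the semiring (+)
  one    :: q            -- monoidal unit; the semiring 1
  tensor :: q -> q -> q  -- monoidal product; the semiring (*)

-- The closed-semiring star, x* = 1 \/ (x . x*), as a least fixed point.
-- The loop terminates only when the ascending chain stabilizes, e.g. in
-- finite quantales; in general x* is the infinite join of all powers of x.
star :: (Eq q, Quantale q) => q -> q
star x = go one
  where
    go acc =
      let acc' = join one (tensor x acc)
      in if acc' == acc then acc else go acc'
```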
Also, what you're saying about Theory A and Theory B is very helpful. It wasn't a distinction I was really familiar with and it's definitely something I should know about. Your suggestion and sketch for a more Theory A version of this paper sound good...I definitely want to see how applied I can go.
I don't think you can become a Theory A person very quickly, Jade. I know about this distinction, and everything we do is heavily Theory B (or maybe Theory C, if such a thing exists).
I and perhaps many other people find "genuine" Theory A a bit scary and imposing. But it isn't all like NP complexity stuff. Some of it really is closer to e.g. Tarjan's papers on graph algorithms. It's just the style and formalisms used are a different sort of rigorous.
One origin of Jade's paper was that I was trying to come up with a really practical application of the "network models" that @Joe Moeller and I had been coming up with. I was trying to use them to help solve shortest-path problems.
We can think of Jade's idea as a rigorous working-out of the basic idea - though she's not using network models, rather structured cospans.
However, I think the methodology will only become practical when combined with other ideas.
I keep talking about these other ideas (to my gang of grad students), but we haven't really put any time into thinking about them in detail.
One reason is that to pursue them I think we need to really forget about category theory for a while.
You're right that it would be hard to become a Theory A person very quickly. I was more thinking that it would be good to be a Theory B person with more of a slant towards Theory A. I think this would be a good strategy because Theory A is where the funding is apparently.
Also, Gershom, I'm not sure I understand your idea about Agda completely. Is the idea that it would be about designing systems rather than analyzing them?
John Baez said:
However, I think the methodology will only become practical when combined with other ideas.
Which other ideas do you mean?
I've said 'em over and over.
1) One is the idea of not trying to get the shortest path, just a "reasonably good" path. That'll let you use approximations to your monad instead of the full-fledged monad.
2) Another is to imagine data that keeps being updated over time. So, you have a big graph written as a composite of smaller open graphs, and the edges are weighted by elements of some commutative quantale... and let's say you already have an approximate solution to the algebraic path problem for the whole big graph... but then someone updates the edge weights on one of the smaller open graphs!
How do you update your solutions for the whole big graph?
Again, we don't want to compute the exact answer... that'll take a long time. But there may be some approximate thing we can do, that's fast.
3) Another is to imagine "black-boxing" one of the smaller open graphs, so all we record about it is the shortest path from each point on its "boundary" to each other point on its "boundary". This may let us keep track of a lot less information, if all we care about is how to get through this region, not move around inside it.
Jade Master said:
You're right that it would be hard to become a Theory A person very quickly. I was more thinking that it would be good to be a Theory B person with more of a slant towards Theory A. I think this would be a good strategy because Theory A is where the funding is apparently.
just move to france :kiss:
John Baez said:
How do you update your solutions for the whole big graph?
ah, incremental computing is a whole Thing :)
@Jesse Sigal showed me a paper about it a little while back which was doing a thing that i noticed was a special case of internal categories
i have some thoughts about that... but maybe ill talk about em sometime elsewhere.
John Baez said:
2) Another is to imagine data that keeps being updated over time. So, you have a big graph written as a composite of smaller open graphs, and the edges are weighted by elements of some commutative quantale... and let's say you already have an approximate solution to the algebraic path problem for the whole big graph... but then someone updates the edge weights on one of the smaller open graphs!
Thanks. I remembered these ideas a bit, I just wasn't sure this is what you meant. I think that this might be related to 2-functoriality of the solution mapping. Changes in your components can be thought of as 2-morphisms on the components. Those get mapped to changes in their solutions. The 2-dimensional aspect of functoriality guarantees that you can glue these small changes together to get a change on the whole graph.
Jade Master said:
Also, Gershom, I'm not sure I understand your idea about Agda completely. Is the idea that it would be about designing systems rather than analyzing them?
Well you have a functor, right, that sends an R-matrix to an idempotent R-matrix that is in some sense (co?)universal with regards to the one you started with, and the "algebraic path problem" is the computation of this in any specific case, right? So you want to write a function in Agda, x : RMat -> (y : IRMat, proof of co-universality of y w/r/t x). Now how do you do that? One way is you write a function that works only for very small things, which you can do exhaustively by cases. Then, for a bigger thing, you decompose it into the small things, run it on the small things, and compose the resultant matrices and proofs back up. This is often how formalized/mechanized proofs of correctness of inductive algorithms work. Your result seems to suggest that this can be done for the shortest paths problem.
so your new algorithm is maybe not better with regards to complexity, but it does come with a certified proof, and that can be useful, and constructions like that are often able to be written up and presented, because they advance our understanding of how to do "old things" in a "nice, new, proof carrying way". Just like sometimes if you write something "old" up with categorical methods it is of interest to some people for that reason alone, because it is more generalized and useful, so too with something like this.
Hope that makes sense?
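To make that shape concrete in Haskell rather than Agda, here is a toy, self-contained version for the Boolean quantale (reachability), with the Agda proof object downgraded to a runtime-checked certificate; all names here are mine:

```haskell
import Data.List (transpose)

type BMat = [[Bool]]

-- Boolean matrix "multiplication": paths of length exactly two.
mmulB :: BMat -> BMat -> BMat
mmulB a b = [ [ or (zipWith (&&) row col) | col <- transpose b ] | row <- a ]

identityB :: Int -> BMat
identityB n = [ [ i == j | j <- [0 .. n - 1] ] | i <- [0 .. n - 1] ]

addB :: BMat -> BMat -> BMat
addB = zipWith (zipWith (||))

-- Reflexive-transitive closure as a least fixed point of c |-> I + m.c,
-- returned together with a checked certificate that the result really is
-- idempotent (a stand-in for the co-universality proof).
closureB :: BMat -> (BMat, Bool)
closureB m = (c, mmulB c c == c)
  where
    n    = length m
    go a = let a' = addB (identityB n) (mmulB m a)
           in if a' == a then a else go a'
    c    = go (identityB n)
```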
Jade Master said:
Thanks. I remembered these ideas a bit, I just wasn't sure this is what you meant. I think that this might be related to 2-functoriality of the solution mapping. Changes in your components can be thought of as 2-morphisms on the components. Those get mapped to changes in their solutions. The 2-dimensional aspect of functoriality guarantees that you can glue these small changes together to get a change on the whole graph.
That's great! But we also need to start thinking like Type A personalities, in terms of algorithms and their runtimes. Do you get how Floyd-Warshall works and why it has runtime $O(n^3)$? I tend to forget, but Wikipedia explains it really well.
So, we need to think about how, after you update a little "patch", you're going to make these updates propagate across the whole graph, in a very nice efficient way.
The Floyd-Warshall algorithm is not sneaky, but it intelligently avoids unnecessary labor.
I guess we should think about the equation on Wikipedia after "and the recursive case is". This is "the heart of the Floyd-Warshall algorithm", but it's also something about how you compute the free idempotent matrix on a matrix.
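For reference, the recursive case in question is

$$\mathrm{shortestPath}(i,j,k) = \min\bigl(\mathrm{shortestPath}(i,j,k-1),\ \mathrm{shortestPath}(i,k,k-1) + \mathrm{shortestPath}(k,j,k-1)\bigr)$$

i.e. the best path from $i$ to $j$ through the first $k$ vertices either avoids vertex $k$, or splits into a best path into $k$ followed by a best path out of $k$.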
Hm. We should read this paper maybe
I prefer to just think....
But yeah, that looks like a fun paper.
I really want you to practice thinking up algorithms. (And me too!!!)
It's not good to get stuck in thinking like a category theorist all the time.
It's an amazing ability - to be able to think like a category theorist - but if you get trapped in doing it all the time it's crippling.
(Maybe I shouldn't be a thesis advisor in public, but oh well. Jade is great; I'm just trying to make her even greater.)
ive been making her learn haskell so that should be helpful for this :halo:
Haha true. And you're right John.
Good! I can't program, so getting someone to teach you programming is great. But I think I might be able to help you dream up some algorithms. I think staring at the "the heart of the Floyd-Warshall algorithm" and really understanding it would be a key step.
You see, we might naively turn a matrix $M$ into an idempotent matrix by computing $1 + M + M^2 + \cdots + M^k$
where $k$ is the length of the longest possible non-self-crossing path in our graph.
If you don't do any sneaky tricks, computing $M^2$ involves $n^3$ "multiplications" when our matrix is $n \times n$.
I'm writing "multiplications" in quotes because we're in a crazy rig where multiplication is addition.
See why we need $n^3$ of them if we don't do anything clever?
And that's just for computing $M^2$! So computing $1 + M + M^2 + \cdots + M^k$ is gonna be a lot more work.
But Floyd-Warshall says: no, you just need $n^3$ steps!
By the way, $k$ is about $n$ in the worst case scenario.
So anyway, you should think of Floyd-Warshall as a clever way of computing $1 + M + M^2 + \cdots + M^k$ with a lot fewer "multiplications", and you should understand why it works, and you should use the same sort of clever idea in other algorithms.
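Here is a minimal Haskell rendering of that cleverness (the function names are mine), written over an arbitrary "plus"/"times" pair so the same code covers the other quantales too:

```haskell
-- Floyd-Warshall over an arbitrary "plus"/"times" pair; for shortest
-- paths take plus = min and times = (+), with 0 on the diagonal and
-- 1/0 (infinity) for missing edges.
floydWarshall :: (w -> w -> w)  -- "plus": choose between alternative paths
              -> (w -> w -> w)  -- "times": concatenate paths
              -> [[w]] -> [[w]]
floydWarshall plus times d0 = foldl relax d0 [0 .. n - 1]
  where
    n = length d0
    -- The heart of the algorithm: allow vertex k as an intermediate stop.
    relax d k =
      [ [ (d !! i !! j) `plus` ((d !! i !! k) `times` (d !! k !! j))
        | j <- [0 .. n - 1] ]
      | i <- [0 .. n - 1] ]
```

That's $n$ passes of $n^2$ updates each, hence the $n^3$ count.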
I predict that there's a category-theoretic "heart" of the Floyd-Warshall algorithm, which is some nice way to understand how it computes so quickly. It's something about what happens when you add one extra point to the vertex set of your graph: it uses this idea recursively.
John Baez said:
I predict that there's a category-theoretic "heart" of the Floyd-Warshall algorithm, which is some nice way to understand how it computes so quickly. It's something about what happens when you add one extra point to the vertex set of your graph: it uses this idea recursively.
I think this adding of an extra point to the vertex set of your graph might be a pushout :thinking:
I know you said you didn't want more category theory whoops.
Right, I don't. Not until after we have an actual idea.
If you look at the algorithm you'll see we've got a graph on a bunch of vertices $1, \dots, n$, but we only consider paths from $i$ to $j$ that use the vertices $1, \dots, k$ (and of course $i$ and $j$). So we're "restricting" the graph to the set $\{1, \dots, k\}$.
Call this restricted graph $G_k$.
So, we assume we know the shortest path from $i$ to $j$ in the graph $G_k$, and then - using the "heart" of the algorithm - we find the shortest path from $i$ to $j$ in the graph $G_{k+1}$.
So when I say "adding an extra point" I mean going from $G_k$ to the slightly larger graph $G_{k+1}$.
But really the main operation we're using is our ability to take a graph and "restrict" it, getting graphs $G_1 \subseteq G_2 \subseteq \cdots \subseteq G_n$.
Right. It kind of reminds me of a filtration...like in persistent homology.
One maybe useful generalization could be to make the graphs in the sequence increase by more than one vertex at a time
Wouldn't that make the computation harder?
The topic seems appropriate to Journal of ACM.
I think you'd want to add an opening [no pun intended] section discussing classes of problems the open formulation might address; and a final section with CS nitty-gritty: algorithm as @Gershom suggested, an example or two, complexity, etc. Some of which you have in the conclusion.
Minor nit: I don't think anyone considers dynamic programming [a la Bellman] "In software engineering"; the point stands without that phrase.
John Baez said:
You see, we might naively turn a matrix $M$ into an idempotent matrix by computing $1 + M + M^2 + \cdots + M^k$
It might be useful also to compare Floyd-Warshall to a slight refinement of this time-$O(n^4)$ baseline. In particular, a standard trick (cf. the proof of Savitch's Theorem) shows how to turn this into $O(n^3 \log n)$: just iterate the map $M \mapsto M \cdot M$, starting from $1 + M$, until convergence, which will happen in at most $\log_2(n)$ steps!
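A quick Haskell sketch of that trick for the min-plus case (again with names of my own choosing):

```haskell
import Data.List (transpose)

type Weight = Double  -- use 1/0 for +infinity (missing edges)

-- min-plus matrix "multiplication": (ab)_ij = min over k of (a_ik + b_kj)
mmul :: [[Weight]] -> [[Weight]] -> [[Weight]]
mmul a b = [ [ minimum (zipWith (+) row col) | col <- transpose b ] | row <- a ]

-- Repeated squaring: start from "1 + M" (zero the diagonal, i.e. add the
-- empty path at each vertex) and square until nothing changes; that takes
-- at most log2(n) squarings, each costing one matrix multiplication.
closure :: [[Weight]] -> [[Weight]]
closure m = go [ [ if i == j then 0 else m !! i !! j | j <- idx ] | i <- idx ]
  where
    idx  = [0 .. length m - 1]
    go a = let a' = mmul a a
           in if a' == a then a else go a'
```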
Thanks! This is the kind of "standard trick" I'm completely unaware of, since I've never really studied algorithms.
@Jade Master this is a great paper. For CS venues, the most directly interested people would be those associated with the GraphBLAS community. Here is a standards document for a graph-theoretic programming paradigm centered around semirings and algebraic paths: http://www.mit.edu/~kepner/GraphBLAS/GraphBLAS-Math-release.pdf This community grew out of the high performance computing community as an effort to extend the Basic Linear Algebra Subroutines (BLAS), which support pretty much all HPC and scientific computing workloads (think Matlab- and Numpy-style programming), to more general purpose graph-theoretic computations. That community would expect that your technique produces an accelerated version of FW for arbitrary quantales, with an implementation and benchmarking to prove that it is in fact faster.
Characterizing the set of problems where this approach works would be interesting from a theoretical perspective, and the people in the GraphBLAS community would welcome it, particularly John Gilbert at UCSB, but they are in general about getting algorithms to go faster on bigger systems.
Rich Hilliard said:
The topic seems appropriate to Journal of ACM.
Thank you for the advice and your note. It is helpful :smile:
James Fairbanks said:
Jade Master this is a great paper. For CS venues, the most directly interested people would be those associated with the GraphBLAS community.
Ah thank you for the recommendations. I will try to contact some of these people.
I would like to say that I don't have an algorithm yet, just a (vague) suggestion about how to make one.
When it comes to algorithms, the remark by @Sam Tenka sounds important!
Yes. I am still working on absorbing it.