You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.
The nLab is a great resource, but since everything can be improved, I thought it would be good to have a thread on it.
Two ideas to start:
A lot of links on the nLab are dead. It would be useful to have a crawler verify links regularly. Indeed, it would be useful for any linked resource to be saved when it is first linked, so that if the link later breaks, the cached version could be linked to instead.
This is a general feature that any web server should have, as I argue at the end of the blog post 'co-immunology and the Web'.
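A minimal sketch of what such a link checker could look like, in plain Ruby using only the standard library (hypothetical: the nLab runs nothing like this today, and the URL below is just an example):

```ruby
require "net/http"
require "uri"

# Return true if a HEAD request for the URL gets a non-error response.
def link_alive?(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri)
  end
  response.code.to_i < 400
rescue StandardError
  false # timeouts, bad hosts, refused connections: all count as dead
end

# A scheduled job could walk every page's outbound links, flag the
# dead ones, and link a cached copy (e.g. a Wayback Machine snapshot).
["https://ncatlab.org/nlab/show/HomePage"].each do |url|
  puts "#{url}: #{link_alive?(url) ? 'ok' : 'DEAD'}"
end
```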
On another thread a couple of days ago, it occurred to me that it would be useful to have a map of how different CT concepts have ended up being used in the world. We found that Petri nets were being used in Scala, which is very helpful to know.
You may want to cc the Steering Committee, although idea 1 sounds really technical and I have no idea who the sysadmin is now. It used to be Andrew Stacey, but I'm not sure if it's still him.
Idea 2 is nice, but would require some visualisation, and I don't know how much Instiki is capable of doing.
I guess if this were added on the wiki in a way that makes the data extractable, people could slowly add such information to each page, and the data could later be extracted, once enough was available, to build a visualization.
It's just markdown, so adding a section (with a standardised name) for real-world applications would probably do the trick.
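Something like this, say (the section name and entry are hypothetical, purely to illustrate):

```markdown
## Real-world applications ##

* Petri nets: used in Scala (see the discussion in this thread)
* (one bullet per application, added as people find them)
```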
The sysadmin is Richard Williamson. I'm on the Steering Committee and can bring it up.
Yeah, I know David, but he's pretty busy these days.
I think anyone who wants to start adding application sections would be welcome to do so.
So @Henry Story , I did hear back from two other Steering Committee members, and there is general openness to the crawler idea, but I don't know anything about this, such as how to even get started. Do you have any experience with getting such a crawler installed?
I am in the middle of starting a company to get a part-time job to pay for my PhD. But it would be relatively easy (for me) to write one in Scala using Akka. Nevertheless, like any software project, it does require work: writing code, testing, ... I think I need it for the project I am working on (hence my blog post), so at some point I'll be writing one. It would then be some work (but less) to adapt it to the nLab.
I'm technically an 'assistant system administrator' for the nLab; I say 'technically' because I only came on board at the end of last year, and haven't yet contributed much (for reasons I'll explain below). But I can at least offer some initial thoughts. :-)
Richard Williamson is the current sysadmin for the nLab. He's currently working hard on a rewrite of the nForum, part of which involves writing a new Markdown parser. The current priority is to move the nLab to the cloud, and Richard feels this rewrite is a blocker for that. Once the rewrite is done, however, the move to the cloud will present opportunities for people to help out.
As a result of the preceding, Richard currently doesn't have much time available to onboard people wishing to help out with nLab stuff. There are several small changes I started working on, but I reached a point where I needed further assistance from Richard, and his time constraints have meant he's been unable to provide it. So I've not done much more than occasionally dealing with nLab spam.
From my perspective, the nLab software needs administrative re-organisation to make it easier for others to contribute. For example, development is effectively not under version control - the GitHub repos are not actively used (via branches, GitHub forks, issues, pull requests etc.), but are essentially snapshot mirrors.
One other thing I'd like to mention is that nLab currently makes use of Ruby (via Instiki and Ruby on Rails) and Python, and I seem to remember Richard's new parser might be in another language again, maybe C? My personal preference is that we try to avoid adding further languages to the mix, as the more languages involved, the more complex the entire system will be to maintain.
Alexis Hazell said:
One other thing I'd like to mention is that nLab currently makes use of Ruby (via Instiki and Ruby on Rails) and Python, and I seem to remember Richard's new parser might be in another language again, maybe C?
From a certain point of view, this is :scream:.
if you have a point of view from which it's not :scream:, i would be morbidly curious to know what it is
i just wanna know why the search is so slow
um
this better not be doing what i think it's doing
What do you think it's doing?
I don't even know what language that is, but the nLab search is really slow so if you think they've made a horrific mistake then I believe you
it's loading a list of every page from the database and then searching through them in the web app
instead of letting the database do the searching
unless i've misunderstood the code, which is possible—but i did poke around a bit to make moderately sure
sorry, not just a list of every page, but the content of every page, so that it can scan through all of it
It's Ruby - more specifically, Ruby on Rails.
(rails is a framework, the language is still ruby)
Yes, I'm aware. :-)
sorry :zip_it:
I see. So I'm downloading a copy of the entire nLab (minus pictures) every time I do a search :scream: (EDIT: No)
I don't know what RoR does behind the scenes with this, as I only started learning it last year when I came on board, and haven't looked into it any further, for the reasons I described above plus all the other commitments in my life.
well, you aren't, but the ruby process is downloading a copy from the database every time, as far as i can tell
RoR might translate such code directly into db queries for all I know.
and they're probably either on the same machine or on a very fast network with each other, so that's waaaaaaaay faster than downloading the whole nlab over the internet
nah alexis i poked around in the code to make sure
im pretty confident this is what it's doing
one sec
They're on the same server.
ahh, you're one of the admins :o
Well, as I wrote above, 'technically'. :-)
I don't know how much the hardware underlying the current server is impacting performance, but since Richard is prioritising getting things ready for a move to cloud servers, I suspect it might be.
so the `select` method being called on `@web` is, i am pretty willing to guess from strong context clues, this: [screenshot of code]
...not that name resolution in ruby is actually possible to do statically, but we try.
then: [screenshot of code]
so when you say `@web.select { |page| ... }`, that ends up just `@web.pages.select { |page| ... }`, tossing them into this `PageSet` wrapper class instead of an `Array`, roughly speaking
and this is where `.pages` comes from on `@web`: [screenshot of code]
browsing the activerecord docs makes it look like methods generated by `has_many` will produce some kind of instance of this: https://api.rubyonrails.org/classes/ActiveRecord/Relation.html
and that folds in `Enumerable`, so `select` is gonna enumerate the whole thing
it's possible i've confused myself and this is actually a https://api.rubyonrails.org/classes/ActiveRecord/Result.html but that's also folding in `Enumerable`
so either way, that `select` is happening on the ruby end
EDIT: i realized later the `select` is actually probably coming from https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-select (which that documentation completely fails to show, as ironic punishment for me because i mocked the impossibility of statically resolving names in ruby above)
but that does in fact appear to delegate back to `Enumerable` when called with a block, so. same conclusion
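to spell out the two `select`s with a minimal example (hypothetical `Page` model, not instiki's actual schema):

```ruby
# hypothetical model, purely for illustration:
#   class Page < ApplicationRecord; end   # table: pages(name, content)

# QueryMethods#select: builds SQL and loads nothing yet; generates
#   SELECT "pages"."name" FROM "pages"
Page.select(:name)

# Enumerable-style select (block form): loads EVERY row as a Ruby
# object first, then filters inside the app process. the slow path.
Page.all.select { |page| page.content =~ /monad/i }
```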
which means the only way it can be getting all the stuff it needs to be looking at is if `@web.pages` really is just... grabbing the entire site
obviously there are any number of places where i couldve missed something here, but it seems to fit together as far as i can tell, and it would certainly explain why the search function is so incredibly slow
but it's 5:15am here and hooooly crap i should be asleep :sob:
:-) Your analysis certainly seems reasonable to me, but yeah, I only have a passing familiarity with the system (and RoR) at this point. But it wouldn't surprise me if the nLab had significantly outgrown the sort of use-case that RoR might have been designed for.
oh that has nothing to do with it :p
you can just ask the db to match the regex instead of ruby
you just need to fiddle the code around a little so that it's constructing a query that passes the regex to the db, instead of doing a full-site query and then filtering it ruby-side
ruby's regex engine probably has more features than the db's, but i feel like that's a small thing to lose
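roughly this kind of change, i mean (a sketch only, not the actual instiki code; the regex syntax becomes the database's, PostgreSQL's `~*` shown here, MySQL would use `REGEXP`):

```ruby
# sketch only, not the actual instiki code

# before: load the whole site, then filter page by page in ruby
results = web.pages.select { |page| page.content =~ /#{query}/i }

# after: push the match into SQL, so only matching rows come back
# (~* is PostgreSQL's case-insensitive regex match operator)
results = web.pages.where("content ~* ?", query)
```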
im going to close this tab for now before i get sucked into a convo about this
i need. sleep
Fair enough. :-) But for when you come back later: could you perhaps mention all this in the "Bugs and feature requests" discussion in the nForum? That way this issue is less likely to be forgotten. (Oh how I wish we were using an actual issue tracker ....)
More generally, people might like to check out the "Technical TODO list", which describes a number of things that need to be addressed. :-)
sarahzrf said:
if you have a point of view from which it's not :scream:, i would be morbidly curious to know what it is
From the point of view of "C is faster than Ruby and Python, so we should use C".
Well done on the code review, by the way!
Well, I don't remember the exact language Richard is writing the new parser in (it might not have been C, but Go, or something else entirely), but at any rate, his work is not, afaik, intended to address nLab search speed .... Anyway, I've just sent him an invite to join this discussion.
Alexis Hazell said:
Well, I don't remember the exact language Richard is writing the new parser in (it might not have been C, but Go, or something else entirely), but at any rate, his work is not, afaik, intended to address nLab search speed
Sorry, it was just a guess. But I know there's a point of view that thinks that Python and Ruby cannot possibly be faster than lower-level languages like C or Go, and since the slow search came up in this convo as an issue, that became my hunch. I'm sure Richard has his own reasons for choosing a new language, and I'm hoping his experience in industry is a sign of good things to come.
Alexis Hazell said:
He's currently working hard on a rewrite of the nForum, part of which involves writing a new Markdown parser. The current priority is to move the nLab to the cloud, and Richard feels this rewrite is a blocker for that. [...]
One other thing I'd like to mention is that nLab currently makes use of Ruby (via Instiki and Ruby on Rails) and Python, and I seem to remember Richard's new parser might be in another language again, maybe C?
So here's something that's confusing me. There are a few open-source markdown parsers written in Ruby or Python out there, some of which are allegedly fast. Since this rewrite has been such a blocker for moving the nLab to the cloud, what have been the difficulties in re-purposing one of these pre-existing parsers for the nForum? Or was that what Richard's been trying to do?
Sorry, I don't know the details, apart from the parser being only one aspect of an overall rewrite of the nForum code (currently in PHP), which Richard has said is what's needed for the move. Hopefully Richard will join this discussion and fill in those details. :-)
By the way, @David Tanzer has set up a wiki sort of like the nLab and very like the Azimuth wiki - but for the ACT community.
He'll open it up to us pretty soon.
I hope we use it, and I hope some of you programmers work on making it great!
@John Baez Indeed! Once it's launched, we can use a separate stream for planning and detailed discussions about content organization and page contents. Hope to see people there!
Amazing! I talked about this a lot a year or two ago, and then dropped it because no one else seemed to care as much as me
It's a kind of coincidence, @Jules Hedges - David was trying to get the Azimuth Wiki to use TikZ, and it turned out to be easier to set up a new wiki that's similar in format to the Azimuth Wiki but uses TikZ.
Yay! I'll definitely be on board with this
I think we could put content both on the nLab and in new wikis, provided that we link to one another when relevant, so that we don't end up with many disconnected small projects. After all, as applied category theorists we know very well that networks can be glued :)
Guys, if you are going to write an nLab website, do it in an advanced type-safe language! Something where you can learn/use Category Theory while programming! Not Ruby.
:duck:
Alexis Hazell said:
Sorry, I don't know the details, apart from the parser being only one aspect of an overall rewrite of the nForum code (currently in PHP), which Richard has said is what's needed for the move. Hopefully Richard will join this discussion and fill in those details. :-)
Sorry to put you on the spot, it's just that the question occurred to me as I was reviewing this thread. Yeah, let's wait for him to join this.
Rongmin Lu said:
Well done on the code review, by the way!
little can motivate me like the conviction that i have found a mistake in someone else's code :upside_down:
sarahzrf said:
little can motivate me like the conviction that i have found a mistake in someone else's code :upside_down:
Ah yes, the "somebody on the internet is wrong" effect. :upside_down:
John Baez said:
It's a kind of coincidence, Jules Hedges - David was trying to get the Azimuth Wiki to use TikZ, and it turned out to be easier to set up a new wiki that's similar in format to the Azimuth Wiki but uses TikZ.
i wonder if it would be possible to get tikzit running on emscripten and integrate that with such a wiki somehow :thinking:
i think that might make a really nice ui
otoh, tikzit isn't huge, so it might be more expedient to just try to clone it for the web directly
...is what i would say if i had momentarily forgotten how easy it is to underestimate how much work a given thing would be to develop.
(:
sarahzrf said:
i wonder if it would be possible to get tikzit running on emscripten and integrate that with such a wiki somehow :thinking:
Stole my idea... I didn't suggest it, since I assumed it would be really hard, and now someone is doing actual work
For writing, it makes little difference: if the wiki supports TikZ, then you can use TikZit offline and just copy the generated TikZ code in. Integrated TikZit would be really amazing for collaborative editing of diagrams
sarahzrf said:
i wonder if it would be possible to get tikzit running on emscripten and integrate that with such a wiki somehow :thinking:
Not emscripten, but not that far off either using web-assembly: http://tikzjax.com/
wait, tex is written in pascal? [screenshot]
TIL
Jules Hedges said:
For writing it makes little difference, if it supports TikZ then you can use TikZit offline and just copy the compiled code in. Integrated TikZit would be really amazing for collaborative editing of diagrams
hmm, remind me, to what extent can tikzit open tikz code? like, doesn't it save its diagrams as basically just tikz?
I think it has its own file format (but I don't use it, sad laptop)
sarahzrf said:
wait, tex is written in pascal? [screenshot]
this is one of the most nightmarish sentences i've read
Matteo Capucci said:
sarahzrf said:
wait, tex is written in pascal? [screenshot]
this is one of the most nightmarish sentences i've read
oh buckle in then, because that's actually not true
TeX is written in WEB
Which then compiles to Pascal
If you're now wondering what WEB is: it's a programming language invented by Knuth himself, and he's basically the only one to ever write programs in it
how does this improve the situation
Well, basically WEB is an experiment in literate programming: you essentially write documentation that has the code of your actual program interspersed
WEB then compiles the documentation to TeX (via a tool called WEAVE) and the actual code to Pascal (via TANGLE)
All of this is ancient in terms of the age of computers, of course.
This implementation was called "WEB" by Knuth since he believed that it was one of the few three-letter words of English that hadn't already been applied to computing.
This is a quote from the Wikipedia page.
Oh I see. Cool then!
What does everyone have against Pascal? It was state of the art when it was created; it had the wisdom of Dijkstra built in
Jules Hedges said:
What does everyone have against Pascal? It was state of the art when it was created, it had the wisdom of Dijkstra built in
the use of the past tense in your phrase seems extremely relevant to people's overall reaction
anyway, TeX is one of the greatest tools I use, but I think it's somewhat poor from the PL perspective: I never understood how to write things in a modular way, local commands for instance; even packages break each other, and my usual solution is to look up on TeX StackExchange some incantation that magically puts them in the right order...
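(the closest thing I've found is that definitions are local to TeX groups, so a 'local command' can be faked with braces; a minimal sketch in plain LaTeX:)

```latex
\documentclass{article}
\newcommand{\status}{global}
\begin{document}
\status % prints "global"
{% open a group: the redefinition below is local to it
  \renewcommand{\status}{local}%
  \status % prints "local"
}%
\status % prints "global" again
\end{document}
```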
In my opinion TeX is a failed experiment in language design; it demonstrates quite well that building a language with macroexpansion as its model of computation is a nightmare in practice
Specifically, we accept now that having errors is the default path during compilation, whereas a successful compilation is the exceptional path. And macroexpansion based computation is really, really bad for reporting errors
I guess it comes from a time when everyone was still in love with Gödel and Quine and Lisp, and doing weird shenanigans with syntax. Eventually everyone got over that and every reasonable language now respects lexical transparency
Bold of you to assume people ever got over Lisp.
Also TeX is clearly not failed in any practical sense; on the contrary, it's obviously hugely successful :)
Yep, Cobol is successful too
Not really
Cobol is basically dead. At least, I would be very surprised if anybody were still developing new software in Cobol. Its only use is keeping alive software that was written a long time ago.
Ok, fair: Fortran is a better comparison; lots and lots of new software is written in Fortran because it has the best libraries for one important job
Yes, agree
I will probably use LaTeX until I die, having invested so much time in learning it. But younger folks should invent something better.
Literally, I will probably die while debugging some LaTeX.
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
There's LuaLaTeX - it's quite easy to use it to implement new little languages for drawing diagrams, for example.
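As a tiny taste (a hypothetical sketch, nowhere near a full little language), LuaLaTeX lets you hand a computation to Lua mid-document via the `\directlua` primitive:

```latex
% compile with lualatex; \directlua and tex.print are LuaTeX built-ins
\documentclass{article}
\begin{document}
% hand a computation to Lua and print the result back into the document
\directlua{tex.print("2 + 2 = " .. (2 + 2))}
\end{document}
```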
Jules Hedges said:
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
I feel that something like that is either built impromptu within a weekend or never.
JavaScript has a similar story: it was developed, without much thinking, over 10 days
Sure the core programming language could be hacked together in a weekend and be immediately a big improvement, but the typesetting algorithms would be years of work, and getting enough libraries to actually write papers would probably take decades for a team of 1
Imagine rewriting TikZ from scratch, even its user manual is 900 pages long
And that's just one library
Ben Steffan said:
Bold of you to assume people ever got over Lisp.
learn type theory, ben
@Jules Hedges modal type theory is probably the right way to do macros
ppl are working on it...
Jules Hedges said:
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
Here's a side project that spent a whole lot of time to make LaTeX look slightly better: https://ctan.org/pkg/microtype?lang=en
Jules Hedges said:
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
It's been attempted by some people in the logic group at Chambéry, in OCaml:
https://github.com/patoline/patoline
I know that at least one PhD thesis has been typeset in Patoline, but from what I understand you need to personally know the developers if you want to do that :p
(Also, the website patoline.org is down, which is not a good sign for the health of the project…)
There is also ConTeXt...
https://tex.stackexchange.com/questions/4987/why-should-i-be-interested-in-context/5007#5007
https://tex.stackexchange.com/questions/141425/can-you-use-latex-packages-in-context-emulate-context-in-latex
http://www.pragma-ade.nl/overview.htm