You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.
The nLab is a great resource, but since everything can be improved, I thought it would be good to have a thread on it.
Two ideas to start:
A lot of links on the nLab are dead. It would be useful to have a crawler verify links regularly. Indeed, it would be useful for any linked resource to be saved when it is first linked, so that if the link later breaks, the cached version could be linked to instead.
This is a general feature that any web server should have, as I argue at the end of the blog post 'co-immunology and the Web'.
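A minimal sketch of what such a link checker could look like, in plain Ruby using only the standard library (hypothetical: the nLab runs nothing like this today, and the URL below is just an example):

```ruby
require "net/http"
require "uri"

# Return true if a HEAD request for the URL gets a non-error response.
def link_alive?(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == "https",
                             open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri)
  end
  response.code.to_i < 400
rescue StandardError
  false # timeouts, bad hosts, refused connections: all count as dead
end

# A scheduled job could walk every page's outbound links, flag the
# dead ones, and link a cached copy (e.g. a Wayback Machine snapshot).
["https://ncatlab.org/nlab/show/HomePage"].each do |url|
  puts "#{url}: #{link_alive?(url) ? 'ok' : 'DEAD'}"
end
```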
On another thread a couple of days ago, it occurred to me that it would be useful to have a map of how different CT concepts have ended up being used in the world. We found that Petri nets were being used in Scala, which is very helpful to know.
You may want to cc the Steering Committee, although idea 1 sounds really technical and I have no idea who the sysadmin is now. It used to be Andrew Stacey, but I'm not sure if it's still him.
Idea 2 is nice, but would require some visualisation, and I don't know how much Instiki is capable of doing.
I guess if this were added on the wiki in a way that makes the data extractable, people could slowly add such information to each page, and the data could later be extracted, once enough was available, to build a visualization.
It's just markdown, so adding a section (with a standardised name) for real-world applications would probably do the trick.
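Something like this, say (the section name and entry are hypothetical, purely to illustrate):

```markdown
## Real-world applications ##

* Petri nets: used in Scala (see the discussion in this thread)
* (one bullet per application, added as people find them)
```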
The sysadmin is Richard Williamson. I'm on the Steering Committee and can bring it up.
Yeah, I know David, but he's pretty busy these days.
I think anyone who wants to start adding application sections would be welcome to do so.
So @Henry Story , I did hear back from two other Steering Committee members, and there is general openness to the crawler idea, but I don't know anything about this, such as how to even get started. Do you have any experience with getting such a crawler installed?
I am in the middle of starting a company to get a part-time job to pay for my PhD. But it would be relatively easy (for me) to write one in Scala using Akka. Nevertheless, like any software project, it does require work: writing code, testing, ... I think I need it for the project I am working on (hence my blog post), so at some point I'll be writing one. It would then be some work (but less) to adapt it to the nLab.
I'm technically an 'assistant system administrator' for the nLab; I say 'technically' because I only came on board at the end of last year, and haven't yet contributed much (for reasons I'll explain below). But I can at least offer some initial thoughts. :-)
Richard Williamson is the current sysadmin for the nLab. He's currently working hard on a rewrite of the nForum, part of which involves writing a new Markdown parser. The current priority is to move the nLab to the cloud, and Richard feels this rewrite is a blocker for that. Once the rewrite is done, however, the move to the cloud will present opportunities for people to help out.
As a result of the preceding, Richard currently doesn't have much time available to onboard people wishing to help out with nLab stuff. There are several small changes I started working on, but I reached a point where I needed further assistance from Richard, and his time constraints have meant he's been unable to provide it. So I've not done much more than occasionally dealing with nLab spam.
From my perspective, the nLab software needs administrative re-organisation to make it easier for others to contribute. For example, development is effectively not under version control - the GitHub repos are not actively used (via branches, GitHub forks, issues, pull requests etc.), but are essentially snapshot mirrors.
One other thing I'd like to mention is that nLab currently makes use of Ruby (via Instiki and Ruby on Rails) and Python, and I seem to remember Richard's new parser might be in another language again, maybe C? My personal preference is that we try to avoid adding further languages to the mix, as the more languages involved, the more complex the entire system will be to maintain.
Alexis Hazell said:
One other thing I'd like to mention is that nLab currently makes use of Ruby (via Instiki and Ruby on Rails) and Python, and I seem to remember Richard's new parser might be in another language again, maybe C?
From a certain point of view, this is :scream:.
if you have a point of view from which it's not :scream:, i would be morbidly curious to know what it is
i just wanna know why the search is so slow
um
this better not be doing what i think it's doing
What do you think it's doing?
I don't even know what language that is, but the nLab search is really slow so if you think they've made a horrific mistake then I believe you
it's loading a list of every page from the database and then searching through them in the web app
instead of letting the database do the searching
unless i've misunderstood the code, which is possible—but i did poke around a bit to make moderately sure
sorry, not just a list of every page, but the content of every page, so that it can scan through all of it
It's Ruby - more specifically, Ruby on Rails.
(rails is a framework, the language is still ruby)
Yes, I'm aware. :-)
sorry :zip_it:
I see. So I'm downloading a copy of the entire nLab (minus pictures) every time I do a search :scream: (EDIT: No)
I don't know what RoR does behind the scenes with this, as I only started learning it last year when I came on board, and haven't looked into it any further, for the reasons I described above plus all the other commitments in my life.
well, you aren't, but the ruby process is downloading a copy from the database every time, as far as i can tell
RoR might translate such code directly into db queries for all I know.
and they're probably either on the same machine or on a very fast network with each other, so that's waaaaaaaay faster than downloading the whole nlab over the internet
nah alexis i poked around in the code to make sure
im pretty confident this is what it's doing
one sec
They're on the same server.
ahh, you're one of the admins :o
Well, as I wrote above, 'technically'. :-)
I don't know how much the hardware underlying the current server is impacting performance, but since Richard is prioritising getting things ready for a move to cloud servers, I suspect it might be.
so the `select` method being called on `@web` is, i am pretty willing to guess from strong context clues, this: [screenshot of code]
...not that name resolution in ruby is actually possible to do statically, but we try.
then: [screenshot of code]
so when you say `@web.select { |page| ... }`, that ends up just `@web.pages.select { |page| ... }`, tossing them into this `PageSet` wrapper class instead of an `Array`, roughly speaking
and this is where `.pages` comes from on `@web`: [screenshot of code]
browsing the activerecord docs makes it look like methods generated by `has_many` will produce some kind of instance of this: https://api.rubyonrails.org/classes/ActiveRecord/Relation.html
and that folds in `Enumerable`, so `select` is gonna enumerate the whole thing
it's possible i've confused myself and this is actually a https://api.rubyonrails.org/classes/ActiveRecord/Result.html but that's also folding in `Enumerable`
so either way, that `select` is happening on the ruby end
EDIT: i realized later the `select` is actually probably coming from https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-select (which that documentation completely fails to show, as ironic punishment for me because i mocked the impossibility of statically resolving names in ruby above)
but that does in fact appear to delegate back to `Enumerable` when called with a block, so. same conclusion
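to spell out the two `select`s with a minimal example (hypothetical `Page` model, not instiki's actual schema):

```ruby
# hypothetical model, purely for illustration:
#   class Page < ApplicationRecord; end   # table: pages(name, content)

# QueryMethods#select: builds SQL and loads nothing yet; generates
#   SELECT "pages"."name" FROM "pages"
Page.select(:name)

# Enumerable-style select (block form): loads EVERY row as a Ruby
# object first, then filters inside the app process. the slow path.
Page.all.select { |page| page.content =~ /monad/i }
```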
which means the only way it can be getting all the stuff it needs to be looking at is if `@web.pages` really is just... grabbing the entire site
obviously there are any number of places where i couldve missed something here, but it seems to fit together as far as i can tell, and it would certainly explain why the search function is so incredibly slow
but it's 5:15am here and hooooly crap i should be asleep :sob:
:-) Your analysis certainly seems reasonable to me, but yeah, I only have a passing familiarity with the system (and RoR) at this point. But it wouldn't surprise me if the nLab had significantly outgrown the sort of use-case that RoR might have been designed for.
oh that has nothing to do with it :p
you can just ask the db to match the regex instead of ruby
you just need to fiddle the code around a little so that it's constructing a query that passes the regex to the db, instead of doing a full-site query and then filtering it ruby-side
ruby's regex engine probably has more features than the db's, but i feel like that's a small thing to lose
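roughly this kind of change, i mean (a sketch only, not the actual instiki code; the regex syntax becomes the database's, PostgreSQL's `~*` shown here, MySQL would use `REGEXP`):

```ruby
# sketch only, not the actual instiki code

# before: load the whole site, then filter page by page in ruby
results = web.pages.select { |page| page.content =~ /#{query}/i }

# after: push the match into SQL, so only matching rows come back
# (~* is PostgreSQL's case-insensitive regex match operator)
results = web.pages.where("content ~* ?", query)
```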
im going to close this tab for now before i get sucked into a convo about this
i need. sleep
Fair enough. :-) But for when you come back later: could you perhaps mention all this in the "Bugs and feature requests" discussion in the nForum? That way this issue is less likely to be forgotten. (Oh how I wish we were using an actual issue tracker ....)
More generally, people might like to check out the "Technical TODO list", which describes a number of things that need to be addressed. :-)
sarahzrf said:
if you have a point of view from which it's not :scream:, i would be morbidly curious to know what it is
From the point of view of "C is faster than Ruby and Python, so we should use C".
Well done on the code review, by the way!
Well, I don't remember the exact language Richard is writing the new parser in (it might not have been C, but Go, or something else entirely), but at any rate, his work is not, afaik, intended to address nLab search speed .... Anyway, I've just sent him an invite to join this discussion.
Alexis Hazell said:
Well, I don't remember the exact language Richard is writing the new parser in (it might not have been C, but Go, or something else entirely), but at any rate, his work is not, afaik, intended to address nLab search speed
Sorry, it was just a guess. But I know there's a point of view that thinks that Python and Ruby cannot possibly be faster than lower-level languages like C or Go, and since the slow search came up in this convo as an issue, that became my hunch. I'm sure Richard has his own reasons for choosing a new language, and I'm hoping his experience in industry is a sign of good things to come.
Alexis Hazell said:
He's currently working hard on a rewrite of the nForum, part of which involves writing a new Markdown parser. The current priority is to move the nLab to the cloud, and Richard feels this rewrite is a blocker for that. [...]
One other thing I'd like to mention is that nLab currently makes use of Ruby (via Instiki and Ruby on Rails) and Python, and I seem to remember Richard's new parser might be in another language again, maybe C?
So here's something that's confusing me. There are a few open-source markdown parsers written in Ruby or Python out there, some of which are allegedly fast. Since this rewrite has been such a blocker for moving the nLab to the cloud, what have been the difficulties in re-purposing one of these pre-existing parsers for the nForum? Or was that what Richard's been trying to do?
Sorry, I don't know the details, apart from the parser being only one aspect of an overall rewrite of the nForum code (currently in PHP), which Richard has said is what's needed for the move. Hopefully Richard will join this discussion and fill in those details. :-)
By the way, @David Tanzer has set up a wiki sort of like the nLab and very like the Azimuth wiki - but for the ACT community.
He'll open it up to us pretty soon.
I hope we use it, and I hope some of you programmers work on making it great!
@John Baez Indeed! Once it's launched, we can use a separate stream for planning and detailed discussions about content organization and page contents. Hope to see people there!
Amazing! I talked about this a lot a year or two ago, and then dropped it because no one else seemed to care as much as me
It's a kind of coincidence, @Jules Hedges - David was trying to get the Azimuth Wiki to use TikZ, and it turned out to be easier to set up a new wiki that's similar in format to the Azimuth Wiki but uses TikZ.
Yay! I'll definitely be on board with this
I think we could put content both on the nLab and in new wikis, provided that we link to one another when relevant, so that we don't end up with many disconnected small projects. After all, as applied category theorists we know very well that networks can be glued :)
Guys, if you are going to write an nLab website, do it in an advanced type-safe language! Something where you can learn/use Category Theory while programming! Not Ruby.
:duck:
Alexis Hazell said:
Sorry, I don't know the details, apart from the parser being only one aspect of an overall rewrite of the nForum code (currently in PHP), which Richard has said is what's needed for the move. Hopefully Richard will join this discussion and fill in those details. :-)
Sorry to put you on the spot, it's just that the question occurred to me as I was reviewing this thread. Yeah, let's wait for him to join this.
Rongmin Lu said:
Well done on the code review, by the way!
little can motivate me like the conviction that i have found a mistake in someone else's code :upside_down:
sarahzrf said:
little can motivate me like the conviction that i have found a mistake in someone else's code :upside_down:
Ah yes, the "somebody on the internet is wrong" effect. :upside_down:
John Baez said:
It's a kind of coincidence, Jules Hedges - David was trying to get the Azimuth Wiki to use TikZ, and it turned out to be easier to set up a new wiki that's similar in format to the Azimuth Wiki but uses TikZ.
i wonder if it would be possible to get tikzit running on emscripten and integrate that with such a wiki somehow :thinking:
i think that might make a really nice ui
otoh, tikzit isn't huge, so it might be more expedient to just try to clone it for the web directly
...is what i would say if i had momentarily forgotten how easy it is to underestimate how much work a given thing would be to develop.
(:
sarahzrf said:
i wonder if it would be possible to get tikzit running on emscripten and integrate that with such a wiki somehow :thinking:
Stole my idea... I didn't suggest it, since I assumed it would be really hard, and now someone is doing actual work
For writing, it makes little difference: if the wiki supports TikZ, then you can use TikZit offline and just copy the generated TikZ code in. Integrated TikZit would be really amazing for collaborative editing of diagrams
sarahzrf said:
i wonder if it would be possible to get tikzit running on emscripten and integrate that with such a wiki somehow :thinking:
Not emscripten, but not that far off either using web-assembly: http://tikzjax.com/
wait, tex is written in pascal? [screenshot]
TIL
Jules Hedges said:
For writing it makes little difference, if it supports TikZ then you can use TikZit offline and just copy the compiled code in. Integrated TikZit would be really amazing for collaborative editing of diagrams
hmm, remind me, to what extent can tikzit open tikz code? like, doesn't it save its diagrams as basically just tikz?
I think it has its own file format (but I don't use it, sad laptop)
sarahzrf said:
wait, tex is written in pascal? [screenshot]
this is one of the most nightmarish sentences i've read
Matteo Capucci said:
sarahzrf said:
wait, tex is written in pascal? [screenshot]
this is one of the most nightmarish sentences i've read
oh buckle in then, because that's actually not true
TeX is written in WEB
Which then compiles to Pascal
If you're now wondering what WEB is: it's a programming language invented by Knuth himself, and he's basically the only one to ever write programs in it
how does this improve the situation
Well, basically WEB is an experiment in literate programming: you essentially write documentation that has the code of your actual program interspersed
WEB then compiles the documentation to TeX (via a tool called WEAVE) and the actual code to Pascal (via TANGLE)
All of this is ancient in terms of the age of computers, of course.
This implementation was called "WEB" by Knuth since he believed that it was one of the few three-letter words of English that hadn't already been applied to computing.
This is a quote from the Wikipedia page.
Oh I see. Cool then!
What does everyone have against Pascal? It was state of the art when it was created; it had the wisdom of Dijkstra built in
Jules Hedges said:
What does everyone have against Pascal? It was state of the art when it was created, it had the wisdom of Dijkstra built in
the use of the past tense in your phrase seems extremely relevant to people's overall reaction
anyway, TeX is one of the greatest tools I use, but I think it's somewhat poor from the PL perspective: I never understood how to write things in a modular way, local commands for instance; even packages break each other, and my usual solution is to look up on TeX StackExchange some incantation that magically puts them in the right order...
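(the closest thing I've found is that definitions are local to TeX groups, so a 'local command' can be faked with braces; a minimal sketch in plain LaTeX:)

```latex
\documentclass{article}
\newcommand{\status}{global}
\begin{document}
\status % prints "global"
{% open a group: the redefinition below is local to it
  \renewcommand{\status}{local}%
  \status % prints "local"
}%
\status % prints "global" again
\end{document}
```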
In my opinion TeX is a failed experiment in language design; it demonstrates quite well that building a language with macroexpansion as its model of computation is a nightmare in practice
Specifically, we accept now that having errors is the default path during compilation, whereas a successful compilation is the exceptional path. And macroexpansion based computation is really, really bad for reporting errors
I guess it comes from a time when everyone was still in love with Gödel and Quine and Lisp, and doing weird shenanigans with syntax. Eventually everyone got over that and every reasonable language now respects lexical transparency
Bold of you to assume people ever got over Lisp.
Also TeX is clearly not failed in any practical sense; on the contrary, it's obviously hugely successful :)
Yep, Cobol is successful too
Not really
Cobol is basically dead. At least, I would be very surprised if anybody were still developing new software in Cobol. Its only use is keeping alive software that was written a long time ago.
Ok, fair: Fortran is a better comparison; lots and lots of new software is written in Fortran because it has the best libraries for one important job
Yes, agree
I will probably use LaTeX until I die, having invested so much time in learning it. But younger folks should invent something better.
Literally, I will probably die while debugging some LaTeX.
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
There's LuaLaTeX - it's quite easy to use it to implement new little languages for drawing diagrams, for example.
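As a tiny taste (a hypothetical sketch, nowhere near a full little language), LuaLaTeX lets you hand a computation to Lua mid-document via the `\directlua` primitive:

```latex
% compile with lualatex; \directlua and tex.print are LuaTeX built-ins
\documentclass{article}
\begin{document}
% hand a computation to Lua and print the result back into the document
\directlua{tex.print("2 + 2 = " .. (2 + 2))}
\end{document}
```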
Jules Hedges said:
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
I feel that something like that is either built impromptu within a weekend or never.
JavaScript has a similar story: it was developed, without much thinking, over 10 days
Sure the core programming language could be hacked together in a weekend and be immediately a big improvement, but the typesetting algorithms would be years of work, and getting enough libraries to actually write papers would probably take decades for a team of 1
Imagine rewriting TikZ from scratch, even its user manual is 900 pages long
And that's just one library
Ben Steffan said:
Bold of you to assume people ever got over Lisp.
learn type theory, ben
@Jules Hedges modal type theory is probably the right way to do macros
ppl are working on it...
Jules Hedges said:
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
Here's a side project that spent a whole lot of time to make LaTeX look slightly better: https://ctan.org/pkg/microtype?lang=en
Jules Hedges said:
"Making a better LaTeX" would be the side project to end all side projects. You could probably spend your whole career on it and still not make something that looks as good
It's been attempted by some people in the logic group at Chambéry, in OCaml:
https://github.com/patoline/patoline
I know that at least one PhD thesis has been typeset in Patoline, but from what I understand you need to personally know the developers if you want to do that :p
(Also, the website patoline.org is down, which is not a good sign for the health of the project…)
There is also ConTeXt...
https://tex.stackexchange.com/questions/4987/why-should-i-be-interested-in-context/5007#5007
https://tex.stackexchange.com/questions/141425/can-you-use-latex-packages-in-context-emulate-context-in-latex
http://www.pragma-ade.nl/overview.htm