You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
@Valeria de Paiva I mean the archive of the categories mailing list. Already the post-2009 mails are nearly lost: some of them are preserved in the Internet Archive's sporadic scrapes of the Gmane servers, but with no consistent url choice. It's not clear if all of them survived. The early ones are neatly curated, and everything from 1990–2009 is stored in big text files complete with email headers, one file per month.
There was concern at MathOverflow meta over old links to the Gmane domain, since that broke (and now seems only accessible using nntp protocol, and via a different TLD address). I easily found four published books on the first page of Google search with Gmane urls in references, pointing to specific emails on the list, including one published this year.
There is a lot of good historical and mathematical information from pioneers in the field on there, including observations not recorded anywhere else. So I hope someone with the skills can hook up to the Gmane servers and extract the last 10 or so years worth of posts (though of late, this list is functionally dead), if they are there, and then host the whole history somewhere accessible and searchable.
David Michael Roberts said:
Valeria de Paiva I mean the archive of the categories mailing list. Already the post-2009 mails are nearly lost: some of them are preserved in the Internet Archive's sporadic scrapes of the Gmane servers, but with no consistent url choice. It's not clear if all of them survived. The early ones are neatly curated, and everything from 1990–2009 is stored in big text files complete with email headers, one file per month.
Thanks for the explanation. I thought the backups were ok before and never really looked. I was furious when Hypatia went offline some 20 years ago. It isn't even on Wikipedia, a shame.
I wish I knew someone with the ability to extract those emails from the Gmane servers, no one comes to mind at the moment. I totally agree that the work there should be preserved.
Ugh, this is a tragedy that could easily have been prevented.
John Baez said:
Ugh, this is a tragedy that could easily have been prevented.
Agreed!!!
I could only find a picture of Hypatia's opening page, attached here. I hope we can find someone to help with the categories mailing list from 2009, it's totally absurd to lose this stuff!
Hypatia-openingpage.tiff
Those with know-how should grab the files from here: https://www.mta.ca/~cat-dist/#archives and store them somewhere public as a backup.
Hypatia's long term archival will protect your work for future generations, (https://web.archive.org/web/19990222165443/http://hypatia.dcs.qmw.ac.uk/html/faq-mirror.html)
Oh the irony...
Sadly, because of the conservative robots.txt file, all the actual material from Hypatia has evaporated, leaving the Internet Archive with just front-facing material that doesn't link anywhere: https://web.archive.org/web/19990208004719/http://hypatia.dcs.qmw.ac.uk/
David Michael Roberts said:
Those with know-how should grab the files from here: https://www.mta.ca/~cat-dist/#archives and store them somewhere public as a backup.
I am working on it... will report back if I succeed...
On a related note, someone (me?) should make an index for the TWF's... At one point I ended up downloading all of them so I could search through them for a specific thing I was looking for.
Let me know if you need technical assistance or a place to store the documents. Big thanks for your service @Simon Burton
I have reached out to the owner of gmane and asked if I can get a copy of the archives it hosts as well. If there's no response, a newsreader can always be configured to just suck everything down, I suppose...
Simon Burton said:
On a related note, someone (me?) should make an index for the TWF's... At one point I ended up downloading all of them so I could search through them for a specific thing I was looking for.
I thought there was one already. But I can't remember where. It was some (?random) mathematical internet citizen.
David Michael Roberts said:
Those with know-how should grab the files from here: https://www.mta.ca/~cat-dist/#archives and store them somewhere public as a backup.
I got this to work: https://arrowtheory.com/mirror/www.mta.ca/cat-dist/
There's also a tarball if anyone else wants to host it: http://arrowtheory.com/mirror.tbz2
Simon Burton said:
On a related note, someone (me?) should make an index for the TWF's... At one point I ended up downloading all of them so I could search through them for a specific thing I was looking for.
It would be great having an index of This Week's Finds. I'm going to start writing more polished articles on some of the recurrent themes, but that's a bit different. I also want to put This Week's Finds into a PDF file on the arXiv. Jason Erbele was starting to LaTeX them, but it's a big job and he soon gave up. I'll probably do something less difficult.
I guess you know there's a table of contents of the first 239 issues. I got tired at that point and quit.
I recommend that we host a CT torrent for publicly available publications. While I work at providing long term storage of text from wisdom traditions, I have the mind set and technology to assist any interested parties.
It might be easy to get the files at the top of this page:
https://www.mta.ca/~cat-dist/#archives
It says
There is an archive of postings at nntp://news.gmane.org/gmane.science.mathematics.categories maintained by Gmane. You may need a news-reader client to access it.
I'm too busy to install a news-reader and download these, but if someone does, they could put these files in a location that's 1) stable, 2) easier to access.
Daniel Geisler said:
I recommend that we host a CT torrent for publicly available publications. While I work at providing long term storage of text from wisdom traditions, I have the mind set and technology to assist any interested parties.
Well, actually there are "public" repositories that have pretty much any book/paper in maths you can think about. I won't state them explicitly here because I guess it's illegal, but suffices to say that there is this library that is called as the first book of the Bible... :slight_smile:
Maybe someone should assemble the old category theory mailing list postings and put them on libgen.
(I don't think there's a law against saying libgen.)
John Baez said:
Maybe someone should assemble the old category theory mailing list postings and put them on libgen.
(I don't think there's a law against saying libgen.)
I think this is the best option for making them universally available!
From what I can see, gmane is not working. I'm not sure if this is a temporary situation, or what...
Oh, so you tried accessing it with a newsreader?
John Baez said:
Maybe someone should assemble the old category theory mailing list postings and put them on libgen.
(I don't think there's a law against saying libgen.)
If you say libgen in the mirror in the dark 3 times, Elsevier will come knocking on your door.
:ogre:
Fabrizio Genovese said:
Daniel Geisler said:
I recommend that we host a CT torrent for publicly available publications. While I work at providing long term storage of text from wisdom traditions, I have the mind set and technology to assist any interested parties.
Well, actually there are "public" repositories that have pretty much any book/paper in maths you can think about. I won't state them explicitly here because I guess it's illegal, but suffices to say that there is this library that is called as the first book of the Bible... :)
well, they don't have what was in the old Hypatia, as they didn't exist then, I'm afraid.
@Simon Burton are you using the new domain? The one listed at the categories home page is outdated. See https://lars.ingebrigtsen.no/2020/01/15/news-gmane-org-is-now-news-gmane-io/
It should be nntp://news.gmane.io, not gmane.org
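For anyone who would rather script the download than configure a newsreader, here is a minimal sketch of talking NNTP over a plain socket. The host and group name are the ones mentioned in this thread; the port (119), the function names, and the assumption that the server allows anonymous reading are mine and untested against the live server. The reply formats follow RFC 3977.

```python
import socket

HOST = "news.gmane.io"                          # new domain, per the message above
GROUP = "gmane.science.mathematics.categories"  # group name from the categories home page
PORT = 119                                      # assumption: standard unencrypted NNTP port


def parse_group_response(line: str) -> tuple[int, int, int]:
    """Parse a GROUP reply of the form '211 <count> <low> <high> <group>'."""
    parts = line.split()
    if len(parts) < 4 or parts[0] != "211":
        raise ValueError(f"unexpected GROUP reply: {line!r}")
    return int(parts[1]), int(parts[2]), int(parts[3])


def read_multiline(lines) -> list[bytes]:
    """Collect a dot-terminated multi-line response, undoing dot-stuffing."""
    out = []
    for raw in lines:
        line = raw.rstrip(b"\r\n")
        if line == b".":           # a lone dot ends the response
            break
        if line.startswith(b"."):  # a stuffed line carries one extra leading dot
            line = line[1:]
        out.append(line)
    return out


def fetch_article(number: int, host: str = HOST, group: str = GROUP,
                  port: int = PORT, timeout: float = 30.0) -> bytes:
    """Fetch one article (headers and body) by its number within the group."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        fh = sock.makefile("rb")
        fh.readline()  # server greeting
        sock.sendall(f"GROUP {group}\r\n".encode("ascii"))
        parse_group_response(fh.readline().decode("ascii", "replace"))
        sock.sendall(f"ARTICLE {number}\r\n".encode("ascii"))
        status = fh.readline().decode("ascii", "replace")
        if not status.startswith("220"):
            raise RuntimeError(f"ARTICLE {number} failed: {status.strip()}")
        article = read_multiline(fh)
        sock.sendall(b"QUIT\r\n")
    return b"\n".join(article)
```

A full mirror would call `parse_group_response` once to learn the low and high article numbers and then loop `fetch_article` over that range, ideally reusing one connection and pausing between requests to be polite to the server.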
Valeria de Paiva said:
Fabrizio Genovese said:
Daniel Geisler said:
I recommend that we host a CT torrent for publicly available publications. While I work at providing long term storage of text from wisdom traditions, I have the mind set and technology to assist any interested parties.
Well, actually there are "public" repositories that have pretty much any book/paper in maths you can think about. I won't state them explicitly here because I guess it's illegal, but suffices to say that there is this library that is called as the first book of the Bible... :slight_smile:
well, they don't have what was in the old Hypatia, as they didn't exist then, I'm afraid.
...Which means that we should upload this stuff there! :D
Fabrizio Genovese said:
Valeria de Paiva said:
Fabrizio Genovese said:
Daniel Geisler said:
I recommend that we host a CT torrent for publicly available publications. While I work at providing long term storage of text from wisdom traditions, I have the mind set and technology to assist any interested parties.
Well, actually there are "public" repositories that have pretty much any book/paper in maths you can think about. I won't state them explicitly here because I guess it's illegal, but suffices to say that there is this library that is called as the first book of the Bible... :)
well, they don't have what was in the old Hypatia, as they didn't exist then, I'm afraid.
...Which means that we should upload this stuff there! :D
this thread is interesting at many levels:
** how to preserve network content. categories mailing list is an easy early web community product: you just need to find it and save it. but some people maybe talk about things worth remembering here on zulip. there are in the meantime many versions of "proprietary email" (as whit diffie calls them). some platforms for social, scientific, dating interactions are provided for free, and they "maintain free services" by collecting data for advertising, campaigning, credit rating. (can a theorem that you mentioned on facebook be used to predict your political affiliation?)
** note the name of hypatia. libraries are sometimes murdered for a purpose. we know 5 tragedies by aeschylus, and i think about 80 titles, but there were allegedly 200 of them in the catalog. one more godless than the other. destroying the web as the tower of babel will undoubtedly become an attractive proposition. eg to disrupt a global conspiracy of category theorists. and it might be not as hard as it used to be. the original internet was designed to survive the fragmentation in case of a nuclear war. but a network resilient to fragmentation is suboptimal for monetizing...
** memory is based on forgetting (cf funes the memorious). if you record all that happens, it all becomes noise. how should the web select what to remember? ants amplify shorter paths using pheromones. dropbox can drop unused links...
QUESTIONS:
1) what might be a good architecture for a community to store old archives. should everyone donate a bit of memory, and someone writes a simple private cloud module? also multiparty?
2) how should a network version of long term memory be managed?
thoughts?
how to preserve network content.
See for instance the evaporation of Google+. For example: I'm glad Lieven le Bruyn saved his posts working through the complete details of a Frobenioid and re-hosted them on his blog, but this is merely one example among many excellent mathematical discussions that are now either gone, or saved in a zip file on someone's personal computer, if they were diligent in saving their timeline before the end.
I recommend we focus on mathematics and not reinvent technology. I'm looking at creating a CT torrent. N'uff said?
archive.org, arxiv, libgen, ipfs, dat, bittorrent, upspin and perkeep (not sure how good these are at distributed access)
torrents have a huge problem in that they are fixed file trees, and making a torrent for every paper would be heavy on metadata and likelihood of availability, usually stopgapped by periodically forming aggregated chunks. ipfs/ipns and dat both have granular addressing and versioning.
programmers have been working on the necessary technology for a while. the other problem, which comes down to resources, is robust hosting, both the content itself and whatever indexes are useful to organize search and access.
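To make the "fixed file tree" versus "granular addressing" contrast concrete, here is a loose illustration in Python. It is not the actual BitTorrent or IPFS format: the function names and the use of SHA-256 hex digests are assumptions chosen for brevity. The point is only that a single identifier computed over a whole collection changes whenever a paper is added, while per-file content addresses stay put.

```python
import hashlib


def address(data: bytes) -> str:
    """Content address of one file: here simply the SHA-256 hex digest of its bytes."""
    return hashlib.sha256(data).hexdigest()


def bundle_address(files: dict[str, bytes]) -> str:
    """One identifier for a whole collection, computed over every name and file hash."""
    h = hashlib.sha256()
    for name in sorted(files):
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        h.update(address(files[name]).encode("ascii"))
    return h.hexdigest()


# Hypothetical two-paper collection, then a third paper is added.
papers = {"a.pdf": b"first paper", "b.pdf": b"second paper"}
bundle_before = bundle_address(papers)
a_before = address(papers["a.pdf"])

papers["c.pdf"] = b"third paper"
bundle_after = bundle_address(papers)
a_after = address(papers["a.pdf"])

print("bundle identifier changed:", bundle_before != bundle_after)
print("address of a.pdf unchanged:", a_before == a_after)
```

Anyone holding the old bundle identifier cannot use it to find the new paper, which is roughly why aggregated torrents have to be re-issued periodically.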
IPFS is probably the most resilient way to store content right now, but I don't know how practical it is for a paper/math repo
What was Jason Erbele's progress on the LaTeX document? I would be willing to contribute to this effort as I had recently discovered TWF's.
have*
@Grant B - umm, he did the first 5 or so. You can contact him at
When he did this I was swamped with other work and not able to give him much help.
Fabrizio Genovese said:
IPFS is probably the most resilient way to store content right now, but I don't know how practical it is for a paper/math repo
great, yes, the functions of IPFS are definitely needed. but as far as i can tell, IPFS seems distributed and anonymous, but is it really resilient? could i not seed it with malware from within that would, say, flood and overload all nodes? can it be resilient without any form of reputation or authentication? but yes, a persistent memory will have to be some sort of ledger. is that an interesting question to pursue? seems like a question that naturally leads into categorical crypto :) as a security proof would need to be at 3 levels at least
Pastel Raschke said:
archive.org, arxiv, libgen, ipfs, dat, bittorrent, upspin and perkeep (not sure how good these are at distributed access)
torrents have a huge problem in that they are fixed file trees, and making a torrent for every paper would be heavy on metadata and likelihood of availability, usually stopgapped by periodically forming aggregated chunks. ipfs/ipns and dat both have granular addressing and versioning.
programmers have been working on the necessary technology for a while. the other problem, which comes down to resources, is robust hosting, both the content itself and whatever indexes are useful to organize search and access.
very good! so i ask myself: is the problem not already solved by archive.org, arxiv and libgen? i use all 3 every day. why do i then ask this question? well, in theory at least, each of them could be sold to elsevier. they are not distributed. but it also gives rise to a more serious question. is there a solution that will not take into account the incentives? who has an incentive to maintain public goods. ((BTW, for that reason the projects that are concerned with privacy seem to be going in a different direction.))
@dusko said:
who has an incentive to maintain public goods.
That is the social service project I've taken upon myself, although from the comments on this thread I see I need to up my game. I help several different groups of yogis preserve their work. I live in Eugene, Oregon in the North West US. This is one of the richest bioregions in the world, and as a result it supports more endangered native languages than anywhere in the world. Lots of potential good technical service projects.
By the way, I hope everyone here knows this github site:
It's very good but it's not complete - it lists more works than it actually has. So, anyone who has access to these works should contribute them!
dusko said:
Fabrizio Genovese said:
IPFS is probably the most resilient way to store content right now, but I don't know how practical it is for a paper/math repo
great, yes, the functions of IPFS are definitely needed. but as far as i can tell, IPFS seems distributed and anonymous, but is it really resilient? could i not seed it with malware from within that would, say, flood and overload all nodes? can it be resilient without any form of reputation or authentication? but yes, a persistent memory will have to be some sort of ledger. is that an interesting question to pursue? seems like a question that naturally leads into categorical crypto :slight_smile: as a security proof would need to be at 3 levels at least
For sure you can host malware-laced files on IPFS, exactly as you can do on bittorrent or with the PDFs you host on github. For me "resilient", in this particular case, means "decentralized". Which means that as long as at least one person seeds the files, it cannot be shut down. The issue of hosting community, "public" files on platforms such as arXiv and github is that they are owned by someone, be it a university or a private company. And this someone can decide to shut everything down and leave everyone hanging.
You may say "well, arXiv won't shut down all of a sudden destroying all of its logs forever" and I agree, but still, I prefer to use something that can mathematically guarantee me that this does not happen instead of having to trust some institution. So for the real "public" stuff (as in "owned by the community") p2p file networks such as IPFS are the only reasonable and "ethical" way to go.
Fabrizio Genovese said:
You may say "well, arXiv won't shut down all of a sudden destroying all of its logs forever" and I agree, but still, I prefer to use something that can mathematically guarantee me that this does not happen instead of having to trust some institution. So for the real "public" stuff (as in "owned by the community") p2p file networks such as IPFS are the only reasonable and "ethical" way to go.
i just said above that arxiv and archive are not a solution because they can be sold to elsevier. and i also appreciate the information that you can always store PDFs on github. but the IPFS design assumption, that there is no need for trust because there is no trusted 3rd party, is very naive. resilient could only mean decentralized if attacks had to be centralized. are you in contact with the IPFS designers?
Well, what I mean is that, for instance, GitHub does not really test for malware in the repos you host. So from the point of view of using a repo as a possible attack vector, I do not see IPFS as less secure than other solutions.
Can you give me an example of an attack using the decentralized nature of IPFS to succeed? I'm not sure I'm really understanding what you mean by "attack" here
Yes, we vaguely know them, but there are other people, such as @davidad (David Dalrymple), who know them way better
John Baez said:
Grant B - umm, he did the first 5 or so. You can contact him at
When he did this I was swamped with other work and not able to give him much help.
Thank you, I reached out to him. I will keep this thread posted on any progress I make.
Grant B said:
John Baez said:
Grant B - umm, he did the first 5 or so. You can contact him at
When he did this I was swamped with other work and not able to give him much help.
Thank you, I reached out to him. I will keep this thread posted on any progress I make.
The immediate progress is that Grant's reaching out to me finally spurred me to figuratively get off my butt and get on zulip. I'd say the first three TWFs are done, and two more are "basically" done, only needing ASCII graphics to be redrawn in TikZ. So "the first 5 or so" was actually remarkably accurate. I did a bit of Spring cleaning, so the Overleaf project I started for TWF is a little bit more organized than it was a week ago, but it'll probably be later this month when I'll have time to make another appreciable dent.
@Jason Erbele Would you like to put your work on github? It seems like this project could be worked on collaboratively...
@Simon Burton
I don't have any objections, per se, but I don't see what the advantage of github would be. And a major obstacle would be that I don't know how to use github. The project is on Overleaf, which is a collaborative platform with version control already, plus it has the LaTeX compiler built in. I can supply a link that anyone can use to join and edit.
But first, I need to get some sleep – it's already 7am here. :grimacing:
Hi, Jason! It's great to see you here and I hope you and your family are doing okay.
If you remind me of the Overleaf link I can put the first few remastered TWFs on my website!
(Overleaf can be used as a git repo, for the record)
Link to TWF on Overleaf, read-only: https://www.overleaf.com/read/sspmkpykyvhr
Editable: https://www.overleaf.com/5857655535xtmhnqbvvkjr
Thanks!
John Baez said:
I'm too busy to install a news-reader and download these, but if someone does, they could put these files in a location that's 1) stable, 2) easier to access.
I put an explanation of how to access it with Emacs here: https://meta.mathoverflow.net/a/4583
I think someone who knows how to program in Emacs Lisp could download a full copy, but I think it's illegal by the letter of the law (unless you asked each and every poster individually for their copyright). So it would have to "fall off a lorry/truck".
I think storing these things legally in the US would involve one of these "safe harbour" provisions: https://en.wikipedia.org/w/index.php?title=Online_Copyright_Infringement_Liability_Limitation_Act&oldid=954682139#Safe_harbor_provision_for_online_storage_-_%C2%A7_512(c)
In my gmail account, I have the categories mailing list archive since 1/1/06. I can't think of any reason I would have deleted any particular messages, but I guess it's possible I'm missing a few. I can export the whole thing via IMAP. Does anyone have a place to host it?
Oh, looks like someone else did the same thing and put it here: http://arrowtheory.com/mirror/categories.tgz
Mike Stay said:
Oh, looks like someone else did the same thing and put it here: http://arrowtheory.com/mirror/categories.tgz
Yes @Mike, @Simon Burton has gotten both the archives online on GitHub https://github.com/punkdit/categories yay!!!
Yay, @Simon Burton!
The trick is now trying to extract the headers of each mail and make an index, maybe even a threading system, like at fom (eg https://cs.nyu.edu/pipermail/fom/2020-August/)
@David Michael Roberts Right... The github repo is pretty much unusable (not user friendly) at the moment... I wonder if there is some kind of python library for doing something like this (parsing headers & generating a threaded index). Also, I'm wondering if/when google will see these messages..
This open-access book has some python scripts that might be a useful starting point: http://www.opentextbooks.org.hk/zh-hant/ditatopic/6826
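On the question of a Python library for parsing headers and building a threaded index: the standard library's `email` package may already be enough for a first pass. Below is a minimal sketch, assuming each message is available as a separate string with RFC 5322 style headers; how to split the monthly archive files into individual messages is not covered here and depends on their exact format. The sample messages, addresses, and function names are made up for illustration. Threading relies on `In-Reply-To`, falling back to the last entry of `References`.

```python
from collections import defaultdict
from email import message_from_string


def index_messages(raw_messages):
    """Map Message-ID to a few headers, and each parent id to its reply ids."""
    headers = {}
    children = defaultdict(list)
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg["Message-ID"] or "").strip()
        if not msg_id:
            continue  # cannot thread a message that has no id
        parent = (msg["In-Reply-To"] or "").strip()
        if not parent:
            refs = (msg["References"] or "").split()
            parent = refs[-1] if refs else ""
        headers[msg_id] = {
            "subject": msg["Subject"] or "",
            "from": msg["From"] or "",
            "date": msg["Date"] or "",
            "parent": parent,
        }
        children[parent].append(msg_id)
    return headers, children


def render_thread_index(headers, children):
    """Return indented 'subject (sender)' lines, one thread after another."""
    lines = []
    seen = set()

    def walk(msg_id, depth):
        if msg_id in seen:  # guard against malformed reply loops
            return
        seen.add(msg_id)
        h = headers[msg_id]
        lines.append(f"{'  ' * depth}{h['subject']} ({h['from']})")
        for child in children.get(msg_id, []):
            walk(child, depth + 1)

    # A thread root has no parent, or a parent that is missing from the archive.
    for msg_id, h in headers.items():
        if h["parent"] not in headers:
            walk(msg_id, 0)
    return lines


# Made-up sample messages, for illustration only.
SAMPLE = [
    "Message-ID: <1@example.org>\nFrom: alice@example.org\n"
    "Subject: Adjoint functors\n\nBody text.\n",
    "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n"
    "From: bob@example.org\nSubject: Re: Adjoint functors\n\nReply.\n",
    "Message-ID: <3@example.org>\nFrom: carol@example.org\n"
    "Subject: Topos question\n\nBody.\n",
]

headers, children = index_messages(SAMPLE)
index_lines = render_thread_index(headers, children)
print("\n".join(index_lines))
```

The standard library also has a `mailbox` module for mbox-style files, which might handle the splitting step if the archives turn out to be in a format it reads. Very old messages may lack `In-Reply-To` altogether, in which case grouping by normalized subject line would be a reasonable fallback.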