Having just uploaded my paper on topological monoid actions to the arXiv (which I will post about on #practice: our papers once it's online tomorrow), now seems like a good time to take stock of what I'm up to.
Beyond the end of my PhD... It's clear that some of the work above is going to have to take place after my PhD ends. If you happen to be or know someone who can offer me a post-doc position where any of the above would be relevant, then please get in touch :stuck_out_tongue_wink: .
This stream has a lot of big names posting in it, so I figure it would be nice thesis-writing procrastination to update my own thread, since I'm just a student. Here's what I've got on in the near future:
It's a bit shorter than other folks' entries; I hope that's relatable.
Looks very similar to my to do for most of last year :)
Update: thesis done! I had a breakthrough in the last month that I couldn't resist squeezing in. It's a relief to have submitted it.
I defended my thesis last Tuesday. Should be up on the arXiv today.
I'm a doctor now yay!
Congratulations! :tada:
Congratulations!
One small note @Morgan Rogers (he/him) : when I tried to get the pdf of your thesis from the arXiv, it failed! I got the postscript ok, though.
Hmm thanks for letting me know @David Michael Roberts, I'll check it later.
I have the same issue, but also the ps fails for me.
The pdf from the arXiv worked for me now. Congratulations! (BTW, you could certainly also cross-list to math.LO.)
Congratulations @Morgan Rogers (he/him) ! The PDF is now working for me as well, after it initially hadn't this morning
Those of you that have spoken to me in the past 6 months or more probably know that I am doing a research project on AI Safety. I figure I might as well share what I've done so far here, in case there is some interest. The aim of the project is to provide some formal grounding for the intuitive concept of goal-directedness, a notion which appears in many stories about how sufficiently capable AI could go bad.
In my preliminary post I discuss the ideas I had about this before the project began.
In my first real post I talked about criteria for judging explanations, since this is a tool I want to apply to assessing goal-directedness.
In my second post I break down the structure of explanations with the intent of giving more flexibility to what counts as goal-directed behaviour.
Category theory gets a mention in the last post, although only incidentally (because I happen to include the category of world models that one could use as a parameter).
The third post in my AI project is finally up.
It's all about complexity. The aim of the post was to build an inclusive, flexible, semi-formal notion of complexity, which includes the ideas that you would find in any branch of complexity theory.
I did try to make it accessible to a wide audience, but I couldn't resist putting some pretty abstract maths in there. I'm curious to know what people reading from here get out of it. Enjoy :grinning_face_with_smiling_eyes:
Even though "computation" is in the name, the mathematically interesting details of computational complexity are considered to be those which are independent of specific implementation. For this reason, it is typical to consider an implementation-independent version
Indeed, I always found it very peculiar that the notion of computational complexity one traditionally uses to study algorithms is given in the context of Turing machines. We have wildly different models of computation (Turing machines, partial recursive functions, lambda calculus) that have been proved to be able to compute the same stuff. Yet, in defining complexity theory, we settled on one choice, in a way that's not invariant under the choice of model.
I believe a computation takes $O(n^k)$ steps to do in one of these models iff it does in any other, with $k$ independent of the model.
I could be wrong, but that's the sort of invariance I'd expect, and that's enough to justify why people say it takes $O(n^k)$ time to do some sort of computation, or things like that.
John Baez said:
I believe a computation takes $O(n^k)$ steps to do in one of these models iff it does in any other, with $k$ independent of the model.
The translation is sometimes quadratic rather than linear, so if it's $O(n^k)$ in one then it will only be $O(n^{2k})$ in the others. To get a model-invariant complexity class we therefore have to pass from a specific exponent $k$ to "polynomial time". I also give other problems which the coarser complexity classes avoid.
There is some intuition for this: if I have a specific computational task in mind, I might be able to design a machine that is particularly efficient at it while still being able to perform other computations. My advisor gave me an example of a machine that can perform computations on graphs efficiently; more generally, the speed at which a computer can perform certain computations on some data structure depends on how that structure is loaded into the machine, and encoding a graph on the tape of a Turing machine is hard to do efficiently. The claim is that the improvement in efficiency will only be polynomial.
Of course, for this to be true one also needs to control which operations count as basic. Beta-reductions in lambda calculus are a bit too powerful, for instance.
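To spell out the sort of invariance at stake (a sketch of the standard argument, with the quadratic translation overhead taken as a working assumption rather than a claim about any specific pair of models): if model $B$ simulates model $A$ with at most quadratic overhead in total running time, then

$$ T_A(n) = O(n^k) \implies T_B(n) = O\big(T_A(n)^2\big) = O(n^{2k}), $$

so the exponent $k$ is not preserved, but membership in $\bigcup_k O(n^k)$, i.e. polynomial time, is.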
Thanks. This is interesting because many algorithms on Wikipedia come with explicit statements that they have complexity of the form $O(n^k)$ with a specific exponent $k$. For example:
https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm
So, it looks like they have an accepted way of modeling the computations, to say more than just "polynomial time".
(In my example $n$ is the number of vertices of a graph; we sometimes have a more complicated version of complexity that depends on several parameters like the number of edges and the number of vertices, but my point is about something else.)
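For concreteness, here's a minimal sketch of the Floyd–Warshall triple loop (my own illustration, not something from the thread); the $O(n^3)$ bound is literally the number of times the innermost comparison/update runs, with each counted as one basic operation:

```python
def floyd_warshall(dist):
    """All-pairs shortest paths.

    `dist` is an n x n list of lists where dist[i][j] is the weight of the
    edge i -> j (float('inf') if absent, 0 on the diagonal). The triple loop
    performs exactly n^3 relaxation steps, i.e. O(n^3) basic operations
    under the usual unit-cost (RAM-like) assumption.
    """
    n = len(dist)
    for k in range(n):            # allowed intermediate vertices 0..k
        for i in range(n):        # source vertex
            for j in range(n):    # target vertex
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```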
There has been a lot of work recently on mechanizing synthetic computability theory, mostly centered around the https://github.com/uds-psl/coq-library-undecidability repo, so maybe we'll eventually get a nicer form of complexity theory too.
They already seem to have some complexity stuff built on top of it, e.g. https://drops.dagstuhl.de/opus/volltexte/2021/13915/
John Baez said:
So, it looks like they have an accepted way of modeling the computations, to say more than just "polynomial time".
I would say that these complexity statements are based on a "model-free" set-up rather than a "model-invariant" one: the complexity of an algorithm is measured in terms of the number of basic operations involved, without actually making reference to any particular implementation (where said operations might not be basic at all!) and the complexity of the problem is the complexity of the least complex algorithm. I hope that idea also comes across in what I wrote..!
Alex Gryzlov said:
There has been a lot of work recently on mechanizing synthetic computability theory, mostly centered around the https://github.com/uds-psl/coq-library-undecidability repo, so maybe we'll eventually get a nicer form of complexity theory too.
What do you mean by "a nicer form of complexity theory"? I did see a talk recently (at the last Chocola meeting in Lyon) about synthetic computability theory, where the focus was around axiomatizing Church-Turing in order to be able to formalize proofs involving it; it was possibly the most convincing use-case for formalization that I've seen.
Morgan Rogers (he/him) said:
John Baez said:
So, it looks like they have an accepted way of modeling the computations, to say more than just "polynomial time".
I would say that these complexity statements are based on a "model-free" set-up rather than a "model-invariant" one: the complexity of an algorithm is measured in terms of the number of basic operations involved
It is "model-free" in that we are abstracting from the cost of basic operations (which are all counted as ), but it does depend on an important, model-specific assumption: that memory is random-access. In other words, in the practice of algorithms, people have taken as default the RAM (Random Access Machine) model of computation. When people say, without any further comment, that "this algorithm has complexity ", they are thinking of an implementation on a RAM-like model. This is justified by the fact that, from the very early days, actual computers are RAM-like (much more than they are Turing-machine-like).
For example, the Floyd-Warshall algorithm does not have complexity $O(n^3)$ on a (multitape) Turing machine, but that's not because the basic semiring operations in question cannot be implemented in constant time on a Turing machine (in some important cases, they can: for instance, for the Boolean semiring $(\{0,1\}, \vee, \wedge)$, which is all we need if we want to compute transitive closure rather than shortest paths). Rather, it is because Turing machines are not random-access, and reading an entry in the adjacency matrix of the input graph spread out on the input tape is not a constant-time operation. And even if you considered random-access Turing machines, you would still have logarithmic factors showing up because manipulating the pointers to the input tape does not have constant cost (the pointers are themselves tapes).
So, in reality, a better interpretation of the unadorned sentence "this algorithm runs in time $O(n^k)$" is "this algorithm runs in time $O(n^k)$ on a RAM". Or, in more practical terms, "when all basic data (typically, integers) fit in less than $w$ bits (where $w$ is some technology-related constant; e.g., these days, $w = 64$), this algorithm runs in time $O(n^k)$ on an actual computer".
Moral of the story: the complexity analysis of algorithms is much less abstract and model-free than what we would like it to be! But that does not affect the meaningfulness of "robust" complexity classes (like logspace, polytime, deterministic or not, etc.), which are model-invariant.
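As a rough back-of-the-envelope illustration of that gap (my own numbers, under the assumptions just stated, not figures from the discussion): on a word RAM with word size $w \ge \log_2 n$, every access to the $n \times n$ adjacency matrix costs $O(1)$, so the $n^3$ relaxation steps of Floyd–Warshall give

$$ n^3 \cdot O(1) = O(n^3), $$

whereas a naive simulation on a one-tape Turing machine may have to walk past $\Theta(n^2)$ matrix entries to serve a single access, yielding only the much cruder bound

$$ n^3 \cdot O(n^2) = O(n^5). $$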
Not an expert either, but my feeling is that there are two distinct communities talking about complexity:
in complexity theory, people are interested in complexity classes (polynomial, linear, etc),
in algorithms, they are interested in precise bounds ($O(n^2)$, $O(n \log n)$, etc).
(I'd be very interested in any categorical, model-independent, invariant, ... approach to both! Ping @Damiano Mazza...)
Yes, that's absolutely true. And in the first community, the model of choice is Turing machines, whereas in the second community, it is RAMs.
However, as I said, most complexity classes of interest in complexity theory are model-invariant, or, even better, have equivalent definitions which do not mention any model at all (see for example descriptive complexity theory). Turing machines are just a convenient choice motivated by the fact that their time and space cost models are self-evident and essentially unquestionable, but they are not inevitable for formulating complexity theory. Here, categorical approaches are definitely possible and, hopefully, relevant (I hope to be able to say something more precise soon!).
In the case of algorithms, there is no hope: precise running times (e.g., up to the exponent in a polynomial bound) depend on the model. We are especially interested in how algorithms behave on the computational devices we use every day, and so we tend to pick models that are close to them. In any case, it is possible to develop categorical approaches to both the languages describing algorithms and the relationship that these have with the cost models of interest. The first aspect is, basically, modern programming languages theory :upside_down: (and you know more than me about it!). The second aspect is less developed, and certainly more of a present-day challenge, but there is already interesting work on the topic. A couple of examples that come to my mind now are Blelloch and Harper's approach to time and space cost models in functional programming languages (although it does not explicitly use any category theory) and Danner and Licata's papers on denotational cost semantics (which uses monads!). But there's certainly more out there.
Thanks @Damiano Mazza , this clarifies a lot. So there is some sanity in the basic definitions of complexity theory, I'm very happy to hear that lol
There is also "fine-grained complexity theory" which tries to prove lower bounds on the k in O(n^k), with many interesting results (conditional on conjectures such as the exponential time hypothesis for SAT). Also, there's an old theorem that says that for any k, l >= 2, k-tape and l-tape Turing machines can simulate each other with a logarithmic overhead in time. So I'd say that even from a complexity-theoretic POV, "time O(n^k * polylog(n))" is a reasonably mathematically robust notion.
(That said, I've never really understood what is being really achieved when one shows that a log factor can be brought down to inverse-ackermann in some algorithm…)
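For reference, the tape-reduction result being alluded to is presumably the classical Hennie–Stearns simulation, usually stated along these lines (quoted from memory, so treat the exact form with care):

$$ \mathrm{DTIME}_k\big(t(n)\big) \subseteq \mathrm{DTIME}_2\big(O(t(n)\log t(n))\big) \qquad \text{for } t(n) \ge n,\ k \ge 2, $$

which is why a class like "time $O(n^k \cdot \mathrm{polylog}(n))$" does not depend on the number of tapes.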
Relatedly, I'd be interested in what other computer scientists think of this series of blog posts geared towards programmers: https://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html (personally, I'm not really convinced)
The black hole analysis is bogus and the empirical results don't correspond to the theory he presents (where the speed limit of information transfer is what creates the bound), but an $N^{1/3}$ bound in general and a $\sqrt{N}$ bound for a planar architecture seem reasonable.
Ultimately there are yet further operations one could penalize in the complexity calculation by examining finer and finer grains of logistical detail in the operation of a particular model of computation. This is the kind of assumption it is helpful to be conscious of when doing complexity theory, I think: any amount of abstraction amounts to taking some operations for granted, treating them as cost-free. Acknowledging this maintains the focus on what a given notion of complexity is "actually" measuring.
It turned out that my post on complexity wasn't quite sufficient for the complexity measures I needed to build for goal-directedness, so I wrote an extension of it in a fourth post. This one is about relative complexity. It's not especially comprehensive; it's more of a record of me working out what relativisation can mean and do.
Thanks for sharing these posts! I would like to try and find time and energy to read them properly, as mathematical notions of complexity are something new and exciting to me. However, in a medical imaging context, I have worked a little bit with something called "sparsity", which is something I've thought of as measuring the simplicity of something.
In a vector space $V$, if we choose a basis $B$ so that we can write a vector $v$ as a sum of $k$ non-zero multiples of basis vectors, we say that $v$ is $k$-sparse with respect to $B$. (Hopefully I am remembering this right). How sparse a vector is depends on the basis that we choose, which reminds me of your section entitled "fragility of complexity measures".
We can use sparsity to help solve optimization problems, where the intuition is that the real solution is "simple", which we approximate by saying it should be pretty sparse with respect to a chosen basis. A challenge here is choosing a basis that makes the unknown solution actually sparse. It's my guess that people are using machine learning these days to find sparsifying basis vectors. Anyways, this approach can help reject "artifacts" in images, and reduce the impact of noise - intuitively because these additions to an image make them less simple.
Do you happen to know if the sparsity of a vector with respect to a basis is somehow related to the kind of complexity you are talking about?
Edit: Upon reading a little more carefully, I see you talk about the squared length of vectors in a given basis as an example of how we can vary complexity by changing the building blocks involved. The example of sparsity is a little different, but very similar, so I think I understand that sparsity probably does fit under the umbrella of the things you're talking about here. Cool!
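If it helps to make the basis-dependence concrete, here's a small sketch (my own illustration, not from the discussion; the function name and the example basis are made up): the same vector can be 1-sparse in one basis and fully dense in another.

```python
import numpy as np

def sparsity(v, basis, tol=1e-12):
    """Number of non-zero coefficients of v in the given basis.

    `basis` is a square matrix whose columns are the basis vectors; we solve
    basis @ c = v and count the coefficients whose magnitude exceeds `tol`.
    """
    coeffs = np.linalg.solve(basis, v)
    return int(np.sum(np.abs(coeffs) > tol))

v = np.ones(4)                                             # the vector (1, 1, 1, 1)
standard = np.eye(4)                                       # standard basis
tilted = np.column_stack([np.ones(4), np.eye(4)[:, 1:]])   # a basis containing (1,1,1,1)

print(sparsity(v, standard))  # 4: dense in the standard basis
print(sparsity(v, tilted))    # 1: 1-sparse in the tilted basis
```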
Yes indeed! It's really interesting to hear someone studying the dual problem of choosing a complexity measure out of a particular family which makes the chosen instance simple. For a single vector, the problem is less interesting, but for simultaneously making a whole family of vectors as sparse as possible I imagine it gets difficult fast. :grinning_face_with_smiling_eyes:
In the little work I've done in this direction - testing experimentally to see if this approach can produce images with finer resolution - we've rigged things in our favour, by imaging toy systems that we know are sparse (simple) with respect to a chosen basis. It seems like we can make great images in that setting, which is kind of exciting. But once one wants to apply this work to real biological systems, then yes, I think it probably gets much harder.
Incidentally, I happened to run across this article "Kolmogorov Complexity of Categories" (https://arxiv.org/pdf/1306.2675.pdf). It reminded me of your linked posts, and so I thought I might as well share a link here, in case it might be relevant.
Morgan Rogers (he/him) said:
Is there a particular reason for the move? Just curious, in case it's for a reason that you're willing to share. (I'm not on Twitter, but this is a considerable motivator for me to consider joining mathstodon)
I did this; my handle is rogers@lipn.info
Great, I'll follow you.
I gave a seminar at metauni last week, on how logic may be used to understand monoid actions. I ended up covering a fair amount of foundational material relating to my work, and if you know a little logic and algebra you may find it interesting. Here's a fun screenshot:
Screen_Shot_2022-11-03_at_9.22.57_pm_copy1.png
Metauni is hosted on Roblox, which is a platform designed for building video games, which makes it an amusingly surreal setting for a talk.
When not preparing applications for jobs, I've been working on a paper based on the discussion of (multiplicatively) idempotent rigs which started here on Zulip and progressed on Mastodon. It's led to some interesting combinatorics which I have enjoyed playing with. Otherwise, I've been continuing collaborations at LIPN whose details I'll share some other time.
Last Sunday @William Troiani and I set up a Twitch stream where we have maths research discussions live; here is our channel. We're keeping our discussion to "side projects" which enable us to learn stuff and for which we don't mind taking the "risks" of open discussion; our first project is about machine learning and cryptography.
We're planning another one this Saturday morning; I expect we'll start at around 10am (Paris time) although we're still ironing out the technical set-up so it might be later. Drop by if you like!
Just want to mention that I am the person who started the discussion about idempotent rigs, by asking whether there exists a non-commutative idempotent rig, if ever someone cares about picky little historical details. I've already seen a preprint that mentioned my PhD project without giving my name, in a cunning way, and I thought it lacked a kind of precision that is surprising given that maths is supposed to be the field of human knowledge with the highest standard of rigour. Surprisingly, the exact same kind of oversight appeared a second time, on Mastodon! I'm interested to see if it happens a third time :thinking:. (Sorry for this message; I just think it would be fair to mention that I started this discussion, and in the end I just want to get a job one day.) I think you said once that you are interested in ethics, so think about what is best to do here. I think there is some value in asking questions in maths even if you don't answer them, or even if it's just to point out something interesting with a simple question. It was a simple question, but it started your further discussions, the whole process, and then your idea of writing a paper. I don't know what is best to do in such a case to explain the story. Maybe simply say how things happened in the simplest way: "A discussion started when ... asked about noncommutative idempotent rigs, and then we became interested in the free idempotent rig in order to answer this question and understand more about these objects...". That's an editorial suggestion. If you think it's ethically better not to mention my name, then that would of course be the best thing to do. I'm not so knowledgeable in ethics, so I can't say, and I'm seeking your help on this point.
Yep, don't worry, I say you're the one that kicked things off!
Thanks, I appreciate it!
I also mention everyone who participated in the discussion both on Zulip and Mastodon too, at least as far as I could keep track (the Mastodon threads are dispersed enough that I could have missed someone who replied to a comment in the middle of the discussion without me noticing, but I did my best..! Zulip is definitely easier to use for this kind of record keeping :sweat_smile: )
That's nice :)
It's certainly better to give people credit for things like this. It may seem like an insignificant thing until you yourself raise a question and then people go ahead and write a paper about it and don't mention you.
I have never had anyone email me and complain that I cited them in a paper of mine. :upside_down:
Morgan is mentioned here.
Heh thanks for the publicity :innocent: I also got married last month so it's an exciting period in my life.
Double congrats Morgan!!
With the remains of a very teaching-heavy first semester almost swept up, I think I can post an update on what I'm working on!
If you wondered about Binary Operation, unfortunately @William Troiani went back to Australia at the end of November. However, he is still enthusiastic about the streaming research concept, so maybe we'll find a time slot in the week that can make it work? I won't make any promises about this, though.
And by "just write it", he means "just write it for the forth time".
Morgan Rogers (he/him) said:
- Two theoretical computer science papers with collaborators here at LIPN that got rejected from FoSSaCS last month are back on the table. Neither was a fully finished product, so this was going to happen anyhow, but the rejections were more motivating than disappointing, so we're keen to knock out the remaining work!
- Before the holidays I met up with Ivan Di Liberti and we pretty much got to the bottom of a paper that we've been hacking away at for what must be over 2.5 years by now. It's about Deligne's completeness theorem in topos theory. We just need to write it now...
Just ( :laughing: ) 10 weeks later, those computer science papers have been resubmitted to a conference (after a great deal of improvement!) and my paper with @Ivan Di Liberti is now up on arXiv! Go have a look at that if you've ever wondered why some toposes have enough points...
Hi folks. I've had a mostly relaxed summer, but some work trickled out nonetheless. Here's what I'm up to:
I really appreciate that you mentioned, in the preprint on multiplicatively idempotent rigs, that I started the discussion on Zulip and the nForum, asked an interesting question, etc. For sure, this can only help me (I mean, you know, you want people to know your name in order to get a job one day). Now, I should say that I didn't choose the name "Boolean rig" as you wrote. This name must come from Toby Bartels, who created https://ncatlab.org/nlab/revision/Boolean+rig/1 in 2009 and completed it alone up to revision 5 in 2020.
You can read in the entry:
"References: There seem to be none. The ‘Boolean semirings’ in the literature (by which I mean, what Google found for me that wasn't cloaked) are something much more complicated than what we're looking at here." which should mean that he chose this name himself.
A sensible correction, I'll say "used the term" rather than "chose to call them".
Don't hesitate to tell me if you spot any typos, for when I upload a corrected version at the end of this week.
You can get LaTeX to work in arXiv abstracts, so the abstract can show the rendered equation r^2 = r instead of what it now says, the literal string "$r^2 = r$". Maybe you should use
\(r^2 = r\)
instead of
$r^2 = r$
I forget, but you can test out your abstract while reposting your paper and see what works. This is pretty important since this is what most people first see when they meet your paper.
Typo:
Accordinly, the main problem
I don't think you want to end this sentence with an ellipsis; I think a period is better:
a definition inconsistent with the usual definition of Boolean rings given above...
The parentheses here are confusing to me:
Correspondingly, we can construct the free mirig on A as (a quotient of) the free commutative monoid on the free idempotent monoid on A.
Either taking a quotient is essential, in which case I wouldn't parenthesize "a quotient of", or it's not, in which case I'd delete the parenthetical remark entirely. As it stands, the reader is left wondering why you parenthesized the remark.
With all this machinery in place, we can at last change gears and move onto mirigs.
You mean "move on to mirigs": we are proceeding to the topic of mirigs. "Moving onto mirigs" suggests that we're about to hop on top of a mirig.
(This is related to a pet peeve of mine: in a clear indication that the apocalypse is nigh, people have started writing things like "To logoff, press this." Also see check out vs checkout.)
I like this paper a lot: I'll admit I was not motivated to work through the more technical sections, but everything up to (and not including) Section 2.1 was extremely easy and fun to read, thanks in part to your explicit use of a narrative - how a bunch of people got together and worked on this problem - but also thanks to lucid, patient explanations of all the fundamentals.
Thanks for catching those errors and reading as far as you did @John Baez !
Sure. By the way, I realized that somewhere around Section 2 it would be nice to see an outline of the argument to come, perhaps highlighting conceptually interesting aspects. Before that section everything is very easy for me to read. Starting in Section 2.1 it becomes very technical, with lemmas like
For any , has exactly subsemigroups of height at most 2, all of which are replete.
You probably can't avoid this, but you could help the readers (or me at least) by sketching out the plan ahead of time, so we can more easily tell why you are proving all these lemmas.
Agreed, I think I need to make it clearer that:
I was thinking of a paragraph-long outline of the whole plan of attack, introducing the main relevant entities (thickets? etc.) without precise definitions, just saying roughly how you use them to get the job done. I don't know where this 'plan of attack' should be... maybe at the start of section 3, maybe it's there somewhere and I missed it... but before joining you in brutal hand-to-hand combat readers may want to know what the plan is. My impression of the paper was that everything was nice and easy until all of a sudden the battle had begun.
There is an outline in the introduction, on page 2, but perhaps it needs more detail?
No, I'll put the explanation at the start of Section 2.4 rather than disrupting the outline. I think I might even separate Section 2.4 and 2.5 into a separate section.
Ah, I had quickly skimmed over that not knowing how important it would be, quickly moving on to the easy fun stuff that came next. I often find that a more detailed intro to what will happen in each section is useful at the beginning of that very section. E.g. "In this section we prove that every bramble can be constructed by iterated central extensions of irreducible brambles called 'twigs'. By counting twigs (Lemma 3.8) this gives a formula for the number of brambles with n twigs (Thm. 3.12)."
Although it's nonsense, that terminology is amusingly consistent with what's in my paper haha :rolling_on_the_floor_laughing:
I almost started looking for twigs and brambles on the nlab :D
You are free to use them in your next paper.
Brambles are important and established in graph theory, though (or maybe that's the joke?)
Morgan Rogers (he/him) said:
No, I'll put the explanation at the start of Section 2.4 rather than disrupting the outline. I think I might even separate Section 2.4 and 2.5 into a separate section.
Here's what I added.
image.png
Thanks again for the input.
That's an excellent road map and a nice place to put it! One wants these things to be prominent.