Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: community: discussion

Topic: Human math vs Formal math


view this post on Zulip Peva Blanchard (Mar 31 2026 at 21:16):

I recently stumbled upon this paper Compression is all you need which proposes a criterion to distinguish "human mathematics" (the mathematics humans discover and value) from "formal mathematics" (the totality of all valid deductions). I was a bit skeptical at first, but one of the authors is Michael H. Freedman.

Here is the abstract.

Human mathematics (HM), the mathematics humans discover and value, is a vanishingly
small subset of formal mathematics (FM), the totality of all valid deductions. We argue that HM is
distinguished by its compressibility through hierarchically nested definitions, lemmas, and theorems.
We model this with monoids. A mathematical deduction is a string of primitive symbols; a definition
or theorem is a named substring or macro whose use compresses the string. In the free abelian
monoid $A_n$, a logarithmically sparse macro set achieves exponential expansion of expressivity. In
the free non-abelian monoid $F_n$, even a polynomially-dense macro set only yields linear expansion;
superlinear expansion requires near-maximal density. We test these models against MathLib, a large
Lean 4 library of mathematics that we take as a proxy for HM. Each element has a depth (layers of
definitional nesting), a wrapped length (tokens in its definition), and an unwrapped length (primitive
symbols after fully expanding all references). We find unwrapped length grows exponentially with
both depth and wrapped length; wrapped length is approximately constant across all depths. These
results are consistent with $A_n$ and inconsistent with $F_n$, supporting the thesis that HM occupies a
polynomially-growing subset of the exponentially growing space FM. We discuss how compression,
measured on the MathLib dependency graph, and a PageRank-style analysis of that graph can quantify
mathematical interest and help direct automated reasoning toward the compressible regions where
human mathematics lives.
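
(An aside of my own, not a claim from the paper, on why the abelian vs non-abelian contrast might matter: a standard count of how many elements exist at each length in the two monoids is

$$\#\{\,w \in F_n : |w| = L\,\} = n^L, \qquad \#\{\,w \in A_n : \deg w = L\,\} = \binom{L+n-1}{n-1} = O(L^{n-1}),$$

so at a given length there is exponentially much to express in $F_n$ but only polynomially much in $A_n$, which at least makes it plausible that a sparse macro set can go much further in the abelian case.)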

It's the first time I've seen an attempt to characterize "human mathematics", so it's very intriguing to me.

Are you aware of similar attempts, possibly from philosophers?
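
To make the three measurements from the abstract (depth, wrapped length, unwrapped length) concrete, here is a toy sketch. It is my own illustration in Python; the three-entry "library" and all names are invented, and nothing here is taken from the paper's code or from Mathlib:

```python
# Toy illustration of "depth", "wrapped length", and "unwrapped length".
# Each entry maps a name to the list of tokens in its definition; a token is
# either a primitive symbol or the name of another entry in the library.

from functools import lru_cache

library = {
    "pair":  ["(", "x", ",", "y", ")"],   # built from primitives only
    "swap":  ["pair", "of", "y", "x"],    # refers to "pair" once
    "twist": ["swap", "after", "swap"],   # refers to "swap" twice
}

def wrapped_length(name: str) -> int:
    """Number of tokens appearing literally in the definition."""
    return len(library[name])

@lru_cache(maxsize=None)
def unwrapped_length(name: str) -> int:
    """Primitive symbols after fully expanding all references,
    counting shared definitions with multiplicity."""
    return sum(unwrapped_length(t) if t in library else 1 for t in library[name])

@lru_cache(maxsize=None)
def depth(name: str) -> int:
    """Layers of definitional nesting (a bare primitive symbol counts as depth 0)."""
    refs = [t for t in library[name] if t in library]
    return 1 + max((depth(r) for r in refs), default=0)

for name in library:
    print(name, depth(name), wrapped_length(name), unwrapped_length(name))
```

The point of the toy is just that wrapped lengths stay small at every depth while the unwrapped count multiplies through every reference (shared definitions are expanded with multiplicity), which is the exponential-in-depth behaviour the abstract reports for MathLib.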

view this post on Zulip Ryan Wisnesky (Mar 31 2026 at 23:01):

In the proof engineering community we often say that the verification conditions arising from proving code correct are "wide but shallow", and hence easier to automate, than the propositions arising from 'normal math', which are usually 'narrow but deep'. Certainly many people have noted that formal verification pre-2020 seems to have been applied more to code than to math; I would argue that this is for the deep-vs-wide reason.

view this post on Zulip John Baez (Apr 22 2026 at 02:52):

Here's a long article that I enjoyed and basically agree with:

It says that mathematicians should hurry up and let people know that the point of math is not just proving theorems - so that the people who claim AI can quickly make human mathematicians unnecessary by getting good at proving theorems won't fool the world.

I've known this all along, and everything I do is supposed to make that clear, so it seems a bit sad that we have to run out and tell people this... but we may need to.

view this post on Zulip Federica 🦅 (Apr 22 2026 at 06:23):

Sounds like what I did in the last collaborative YouTube video advertising CT for Quantum and AI. We gotta get better at sales here! :rolling_on_the_floor_laughing:

view this post on Zulip Morgan Rogers (he/him) (Apr 22 2026 at 12:33):

I think the article does a great job articulating the reasons that AI won't replace the actual work of mathematicians any time soon. It also strikes me that this analysis applies to many of the other domains in which AI is being used to replace human labour.
It's funny that Bessis is careful to defend a number of people he quotes from the potential accusation of Luddism. I'm reading Brian Merchant's Blood in the Machine at the moment, a historical account of the Luddites, and it's been illuminating to learn that the modern meaning of the term (ignorant technophobe) is inherited from the ruling class's depiction of the workers involved. In fact, these were people being pushed out of work by the unregulated introduction of machinery by capitalists, and their plight was consistently ignored or dismissed by the government of the time. They had no objection to technology that improved their working conditions, but they were left with essentially no choice except to start destroying the technology that deprived them of work, after all other options were denied them.
It may not surprise you, in light of current and recent events, to learn that 1) this destruction of property was very popular with the public of the time, and 2) the government's response was to continually ramp up the severity of the penalties for these actions rather than consider any of the proposed reforms, until destroying cloth-manufacturing machinery literally became punishable by death.

view this post on Zulip Ruby Khondaker (she/her) (Apr 22 2026 at 15:06):

It does seem like the Luddites have been unfairly slandered

view this post on Zulip John Onstead (Apr 22 2026 at 19:47):

I don't normally chime in on AI discussion but I feel there's something major that Bessis is missing. Correct me if I'm wrong, but Bessis' reasoning seems to be: 1. math is not merely proving theorems and manipulating symbols, but involves a conceptual element, 2. AI cannot do the conceptual element of math, Conc: therefore, AI will not fully replace human mathematicians. I actually agree with premise 1, which is the main argument he seems to defend in his article, and he defends it quite successfully. However, I think premise 2 is deeply flawed, and no argument for it seems to be given in the article; it is just assumed.

view this post on Zulip John Onstead (Apr 22 2026 at 19:47):

At least from my perspective, there seems to be no barrier, in terms of the laws of physics, stopping someone from training an AI model to solve problems exactly like a human mathematician (just obviously a lot faster and more efficiently). In fact, even now, when I encounter a new math object or concept I often go to an LLM and say "Explain the conceptual intuition behind concept X, and then explain the main motivation for why I should consider X important". In many cases, the LLM is actually really good at explaining this and I walk away with a much better understanding. Of course, this is nowhere near what is needed for AI to actually do anything novel, but this seems to be less an insurmountable barrier and more a problem of finding the right AI model and training data.

My question is: why can't you train an AI model not on Lean or formal logic directly as the teams mentioned in Bessis' article did, but instead train an AI model on the intuitive and conceptual reasoning human mathematicians employ when deriving new definitions or progressing on stages in proofs? Then, when you ask such a model to solve a math problem, it starts trying to generate new definitions like a human mathematician would, and at the end can not only spit out the solution to the problem, but an intuitive explanation for the motivation behind each and every step it used in the proof, as well as an intuitive explanation for every new mathematical object or concept it formally defined?

view this post on Zulip Peva Blanchard (Apr 22 2026 at 20:29):

John Onstead said:

My question is: why can't you train an AI model not on Lean or formal logic directly as the teams mentioned in Bessis' article did, but instead train an AI model on the intuitive and conceptual reasoning human mathematicians employ when deriving new definitions or progressing on stages in proofs?

I think this is the intent of the paper Compression is all you need that I mentioned in the first post. From their abstract:

We discuss how compression, measured on the MathLib dependency graph, and a PageRank-style analysis of that graph can quantify mathematical interest and help direct automated reasoning toward the compressible regions where human mathematics lives.

view this post on Zulip Peva Blanchard (Apr 22 2026 at 20:36):

I just finished reading Bessis's blog post. There are a lot of interesting things (and punchlines) to unpack!

I might have missed the sentence, but I don't think he said that AI will never be able to do "human mathematics". He insisted instead on clarifying what it means for humans to do maths, and letting non-math humans know that it's not really about proving theorems.

He mentioned, however, some things that seem to relate to the paper Compression is all you need, e.g., canonization, i.e., building definitions that can be reused later (for the sole purpose of doing more maths in the future).

view this post on Zulip Peva Blanchard (Apr 22 2026 at 20:43):

The part that caught my attention, about "doing maths humanly", is the following quote

My proposed conceptualist resolution is a rephrasing of Thurston’s view: mathematics does rely on a meaningless game of formal symbols, but we only play this game because we project meaning onto it.

Meaning is a cognitive phenomenon—a product of our neural architecture—and not a direct access to transcendence.

When we “do math”, we manipulate formal expressions and gradually develop an intuitive feel for what they represent, as if they were pointers to objects that “existed” in a Platonic sense. Platonists take this neuroplastic side-effect at face value. Formalists view it as accessory. Conceptualists like me recognize mathematics as a critical cognitive infrastructure of the human species.

I think this counts as an example of an attempt to characterize "human mathematics".

view this post on Zulip Alex Kreitzberg (Apr 22 2026 at 20:47):

I tentatively like intuitionism; it seems like the philosophical motivations for it compete in the same space as "conceptualism". In any case, I don't like brushing a lot of subtle issues under the "it's psychology" rug (but that's an aside, I liked the article).

view this post on Zulip John Baez (Apr 23 2026 at 01:24):

John Onstead said:

I don't normally chime in on AI discussion but I feel there's something major that Bessis is missing. Correct me if I'm wrong, but Bessis' reasoning seems to be: 1. math is not merely proving theorems and manipulating symbols, but involves a conceptual element, 2. AI cannot do the conceptual element of math, Conc: therefore, AI will not fully replace human mathematicians.

I don't think he claims 2. He seems to be saying that right now, humans are better at the conceptual aspect of math than computers - so we, meaning mathematicians, shouldn't throw away this advantage by telling people that all that really matters is proving theorems.

view this post on Zulip David Michael Roberts (Apr 23 2026 at 01:49):

Terry Tao is, I think, sub-tooting that blog in the thread here: https://mathstodon.xyz/@tao/116450581967483825, with a sample:

As a crude first approximation, the problem-solving component of mathematical research (which, one should stress, is not the only aspect of such research) can be decomposed into three subcomponents:

  1. Proof generation (finding a solution to a given problem);
  2. Proof verification (checking that a proposed solution actually works); and
  3. Proof digestion (understanding the essence of a solution, placing it in context with previous literature, summarizing and explaining it effectively, and gaining insights on other related problems and topics).

and he comments on how 1 and 2 are now things that AI systems are making big progress on, while 3 is just as important but more open-ended and subjective. And perhaps at risk:

In fact, with the current cultural incentives that reward the first authors to "solve" the problem, rather than the later authors who "digest" the solution, one may end up with the perverse situation in which an AI-generated (and formally verified) solution to a problem that is presented to the community without any significant digestion may actually inhibit the progress of the field that the problem lies in, by discouraging any further attempts to work on the problem, simplify and explain the proof, and extract broader insights.

view this post on Zulip John Onstead (Apr 23 2026 at 03:25):

Peva Blanchard said:

I might have missed the sentence, but I don't think he said that AI will never be able to do "human mathematics". He insisted instead on clarifying what it means for humans to do maths, and letting non-math humans know that it's not really about proving theorems.

John Baez said:

I don't think he claims 2. He seems to be saying that right now, humans are better at the conceptual aspect of math than computers - so we, meaning mathematicians, shouldn't throw away this advantage by telling people that all that really matters is proving theorems.

Ah, I see. Thanks for clearing that up!

view this post on Zulip Ryan Wisnesky (Apr 23 2026 at 03:29):

GH Hardy would no doubt say that there is as much point to an AI understanding math conceptually as there is to an AI appreciating art instead of a human or a robot going to the gym instead of a human.

view this post on Zulip Martti Karvonen (Apr 23 2026 at 08:48):

Ryan Wisnesky said in #community: discussion > Human math vs Formal math:

GH Hardy would no doubt say that there is as much point to an AI understanding math conceptually as there is to an AI appreciating art instead of a human or a robot going to the gym instead of a human.

Sure, but professionally going to the gym or appreciating art is kind of rare, especially on taxpayer money. Mathematicians need to justify to the outside world why having human professionals focus on conceptual understanding is valuable (to society at large and not just those doing the understanding) in a possible era of automated problem-solving. Not saying this justification doesn't exist, just that it needs to be given over and over again.

view this post on Zulip Pablo Donato (Apr 23 2026 at 09:07):

Note that two authors of the "Compression is all you need" paper, Eve Bodnia and Michael Freedman, are respectively CEO and "Chief of Mathematics" of the AI company Logical Intelligence. So it seems to me that their goal is to some extent to automate parts of human mathematics, as maybe witnessed by the following sentence (page 4):

The goal is to give AI agents exploring formal mathematics a sense of direction: stay where compression is possible.

Another clue is that the title "Compression is all you need" is a nod to the paper that introduced the transformer architecture used in LLMs, Attention Is All You Need.

view this post on Zulip Graham Manuell (Apr 23 2026 at 09:26):

Ruby Khondaker (she/her) said:

It does seem like the Luddites have been unfairly slandered

I don't think whether the historical Luddites were 'good' or 'bad' is particularly relevant to the present situation, but in any case, I found this essay pretty convincing that the bad reputation of the Luddites is deserved after all.

view this post on Zulip Posina Venkata Rayudu (Apr 23 2026 at 10:00):

Given that: Natural structure with respect to a given doctrine is a precise mathematical model for a very general scientific process of concept formation, do you still think conceptualization can't be relegated, if not now, then sometime later, to machines? At the risk of self-promoting, if measurements of properties of a given category of particulars can be construed as representable functors, then the corresponding representing objects along with their incidence relations constitute the theory subcategory of the given category, which readily lends itself to be relegated to computers, although not to the current variant AI fixated on particulars, not realizing that any particular is infinite, and hence the scientific resort to abstracting concepts with respect to a limited doctrine (cf. Maxwell; see Fodor's dog for an entertaining account). The theory theory of concepts championed by Professor Alison Gopnik aligns with above mathematical abstraction of concepts (cf. functorial semantics of Professor F. William Lawvere, sketches of Professor Andree Ehresmann, and descent of Grothendieck). The profound resemblance / parallels between ordinary concepts and scientific conceptualization have been noted by, among others, Einstein, Schapira, and Professor F. William Lawvere.

Lastly, the conceptual-formal duality involves two adjoint pairs: theory-representations and theory-presentations, with theory as the common category making words / symbol strings meaningful.

In closing, most AIs agree that their fixation on particulars is pathological.

view this post on Zulip Matteo Capucci (he/him) (Apr 24 2026 at 10:33):

Peva Blanchard said:

I recently stumbled upon this paper Compression is all you need which proposes a criterion to distinguish "human mathematics" (the mathematics humans discover and value) from "formal mathematics" (the totality of all valid deductions).

They claim that a certain theorem in mathlib:

when fully unwrapped, reaches approximately $10^{104}$ primitive terms

which is so big as to be absurd? Unless I misunderstood, the way they compute this number is that they take the DAG of lemmas used in a theorem and 'unroll it' to its universal covering tree. Then they look at the size of this tree (presumably the number of nodes, but I guess this is basically of the same order of magnitude as the number of leaves). I don't understand why this is a useful metric, though.

I confess I didn't read it through, but I also don't understand why the free non/abelian monoid on a finite alphabet is a good model of compression. What has abelianity to do with it?

view this post on Zulip Matteo Capucci (he/him) (Apr 24 2026 at 10:35):

Ultimately, I agree with their 'conclusion', namely that HM is a very small subset of FM, which is why logicism is absurd, and that much of mathematics is indeed managing the explosive growth of complexity by making good use of hierarchical concepts. I even wrote a blog post about it some years ago.

view this post on Zulip V Slicer (Apr 27 2026 at 19:56):

John Onstead said:

At least from my perspective, there seems to be no barrier, in terms of the laws of physics, stopping someone from training an AI model to solve problems exactly like a human mathematician (just obviously a lot faster and more efficiently).

One barrier might be Gödelian arguments, such as the Gödel-Lucas-Penrose argument. Chalmers puts it as follows:

The argument is given in a somewhat indirect form, involving complex procedures by which a given formal system might have evolved, but its basic structure is very simple. In a simplified and somewhat loose form, the argument goes as follows:

(1) Assume my reasoning powers are captured by some formal system F (to put this more briefly, "I am F"). Consider the class of statements I can know to be true, given this assumption.

(2) Given that I know that I am F, I know that F is sound (as I know that I am sound). Indeed, I know that the larger system F' is sound, where F' is F supplemented by the further assumption "I am F". (Supplementing a sound system with a true statement yields a sound system.)

(3) So I know that G(F') is true, where this is the Gödel sentence of the system F'.

(4) But F' could not see that G(F') is true (by Gödel's theorem).

(5) By assumption, however, I am now effectively equivalent to F'. After all, I am F supplemented by the knowledge that I am F.

(6) This is a contradiction, so the initial assumption must be false, and F must not have captured my powers of reasoning after all.

(7) The conclusion generalizes: my reasoning powers cannot be captured by any formal system.

Strictly speaking, the conclusion that must be drawn is that I cannot know that I am identical to a formal system F; in showing that I can see the truth of G(F'), we assumed not just that I am F but that I know I am F. But this is still a strong conclusion. For example, it would rule out even the possibility that we could empirically discover that we were identical to some system F - if we were to "discover" this, the reasoning would lead us to a contradiction. So even this would be threatening to the prospects of AI.

view this post on Zulip John Baez (Apr 27 2026 at 23:56):

Nothing prevents the "I" in Chalmers' argument from being an artificial intelligence. At least nothing in the argument prevents it.

view this post on Zulip Matteo Capucci (he/him) (Apr 28 2026 at 07:59):

V Slicer said:

formal system

Also I don't see how AI = formal system. Current models are nothing like that!

view this post on Zulip V Slicer (Apr 28 2026 at 08:46):

The argument by Penrose, summarized there by Chalmers, is meant to show the falsity of the computational theory of mind. This is a stronger claim that encompasses all possible algorithmic implementations, including all existing large language models. Penrose would agree that 'artificial intelligence' is possible, if we use the term to mean something genuinely non-algorithmic rather than just another algorithm. He argues that reality, and what we know of physics, isn't limited to the algorithmic, so such an intelligence would in principle be possible.

view this post on Zulip V Slicer (Apr 28 2026 at 08:52):

Gödel shared his view in his “Gibbs lecture” in 1951:

So the following disjunctive conclusion is inevitable: Either mathematics is incompletable in this sense, that its evident axioms can never be comprised in a finite rule, that is to say, the human mind (even within the realm of pure mathematics) infinitely surpasses the powers of any finite machine, or else there exist absolutely unsolvable diophantine problems of the type specified . . . (Gödel 1995: 310).

view this post on Zulip Peva Blanchard (Apr 28 2026 at 09:51):

My question in starting this thread was more about "how do humans do maths?" than "is it possible for AI to do maths, and how?". The paper "Compression is all you need" proposes an intriguing combinatorial criterion on the output of this human activity, namely "human mathematics".

But now, I am wondering about the process of humans doing math, rather than its output. I am wondering whether some sociologists (or anthropologists?) have conducted empirical inquiries on that topic. E.g.

I'm sure a lot of the answers to these questions will look obvious to us. But it would be nice to have a comprehensive dataset.

view this post on Zulip Peva Blanchard (Apr 28 2026 at 09:51):

In that direction, I recall Gowers' essay on the two cultures, i.e. theory-builders vs problem-solvers. But it is an essay (well-informed, for sure), not an actual empirical investigation.

view this post on Zulip Peva Blanchard (Apr 28 2026 at 09:52):

Oh, and also Mathematics without apologies, but again not an actual empirical investigation.