Can AI Know Infinity?
What mathematics reveals about the limits of today’s AI — and why AGI remains distant
“An equation for me has no meaning unless it expresses a thought of God.” — Srinivasa Ramanujan
When I was in high school, my mother gifted me a copy of The Man Who Knew Infinity. It told the story of Srinivasa Ramanujan—a self-taught genius from India whose mathematical intuition ran so deep he often described it as divine. At the time, I was just beginning to wrestle with the idea of infinity in math: infinitesimals in calculus, unbounded limits and asymptotes, the endless decimals of irrational numbers, the infinity of natural numbers, and the even stranger idea that some infinities are larger than others.
The book arrived at exactly the right moment. I was learning to manipulate infinity, but Ramanujan’s story hinted at something more: that infinity in mathematics wasn’t just a number to manage, but a realm to enter.
I didn’t just want to understand what he wrote. I wanted to understand what he understood.
That feeling—of awe, of pursuit, of reaching beyond the page—stayed with me. It led me deeper into mathematics. Later, into computer science. Two disciplines that, in many ways, shaped the intellectual landscape we now live in.
One gave us infinity—a way to think beyond the bounds of the finite.
The other gave us AI—a system that now mimics reasoning itself.
And so the question doesn’t feel abstract. It feels urgent. Can AI know infinity?
This isn’t just poetic curiosity. It cuts to the heart of one of the most charged debates of our time:
Are we on the brink of artificial general intelligence?
Some say yes — pointing to models that solve Olympiad problems, generate elegant proofs, or optimize algorithms better than graduate students. Others say not even close — citing the same models’ inability to navigate logic puzzles that a twelve-year-old can solve.
Nowhere is this tension more visible than in mathematics.
Math is the litmus test. It doesn’t bend to rhetoric. There are no blurry edges. Either a solution holds, or it breaks. Either an insight generalizes, or it falls apart.
If AI can “do math,” that’s a serious step toward general intelligence.1
But what kind of math? And what does “doing” mean?
Because in math, the real challenge isn’t just solving the problem.
It’s knowing what problem is worth solving.
That’s what Ramanujan saw.
The Fundamental Bug: No Inner Judge
AI today can write code, solve equations, and mimic mathematical language with fluency. But beneath the surface, there's a structural flaw that keeps it from doing real mathematics:
It doesn’t know when it’s right.
Ask an LLM to solve a logic puzzle, and it might respond confidently with a wrong answer. Or halt prematurely. Or keep going in a direction that makes no sense.
That’s not a user interface issue. That’s the core architecture.
AI is trained to continue. Not to know.
Human mathematicians are different. We often get things wrong—but we know we’re wrong, and we keep going anyway. We chase ideas we can’t yet prove because they feel right. Ramanujan wrote down hundreds of unproven formulas; many were correct, some were not, but all reflected a deeper intuition. Even his errors were structurally suggestive.
What he had wasn’t certainty. It was a sense of what needed to exist.
AI doesn’t have that. It doesn’t feel surprise. It doesn’t experience doubt. It doesn’t recognize a dead end—or a promising mistake.
And even if we one day succeed in bolting on verifiers or theorem checkers, that won’t solve the real problem. Because we still won’t know how to program what feels right.
What makes a question beautiful? What makes a proof surprising? What gives rise to the conviction that a path—though unproven—might be worth walking?
That sense of direction is not verification. It’s vision.
What AI Can and Cannot Do
Before we can ask whether AI can know mathematics, we need to ask: what can it do?
Modern AI systems, especially large language models, have become surprisingly proficient at solving a wide range of mathematical problems — from calculus exercises to Olympiad-level inequalities, especially when scaffolded with hints or structure. But when pushed beyond the known — into abstraction, creativity, or self-directed reasoning — they falter. Spectacularly.
This isn’t just a difference in difficulty. It’s a difference in kind.
Below is a typology of mathematical reasoning — where today’s AI models succeed, and where they fundamentally break.
What AI Can Do Reliably
Symbolic Procedures
Example: Differentiate f(x) = x³ sin x (worked in full after this list)
Why it works: Follows fixed rules; no conceptual insight needed.
Textbook Proof Replication
Example: Prove that the sum of two even numbers is even
Why it works: Matches learned patterns; templates are widely available in training data.
Scaffolded Problem Solving
Example: Solve an IMO inequality when prompted step-by-step
Why it works: Excels when nudged in the right direction; mimics known strategies.
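To see how mechanical the first category is, here is the derivative from above, worked in full. It is a single application of the product rule, with nothing to decide along the way:

f′(x) = (x³)′ · sin x + x³ · (sin x)′ = 3x² sin x + x³ cos x

Every step is forced by a rule, which is exactly why models trained on vast numbers of such exercises handle them reliably.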
Where AI Fails
These examples highlight current limits in mathematical reasoning, especially in creativity, abstraction, and structural understanding.
Note: “Why it fails” isn’t a hard boundary. It sketches a limitation, not a permanent impossibility. These assessments reflect extended hands-on work with today’s best models, and despite rapid progress, the chasms remain.
Constructing Counterexamples
Example: Give a non-abelian group of order 6
Why it fails: Doesn’t reason structurally; often guesses without understanding.
Reasoning About Parameter Families
Example: Analyze the qualitative behavior of the cubic family x³ - ax + b (see the sketch after this list)
Why it fails: Can’t generalize across parameters or detect qualitative shifts, like when a cubic suddenly gains extra real roots.
Inventing Abstractions
Example: Propose a unifying definition that generalizes pointwise and uniform convergence
Why it fails: Cannot generate new conceptual frameworks; stuck within known vocabulary.
Meta-Reasoning and Self-Verification
Example: Prove that the halting problem is undecidable, or verify whether a proof by induction includes a valid base case
Why it fails: Lacks an internal model of correctness. Confuses structural necessity with surface analogies and cannot distinguish between truth and plausibility—often producing flawed or unchecked reasoning.
Forming Conjectures
Example: Suggest a new connection between continued fractions and Pell’s equation
Why it fails: Cannot speculate meaningfully; no sense of what’s plausible but unknown.
Evaluating Elegance or Surprise
Example: Choose between two equivalent proofs and explain which is more elegant
Why it fails: No aesthetic filter; no intuition for simplicity, beauty, or surprise.
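To make the parameter-family failure concrete, here is the structural fact a trained eye reaches for (a sketch of one route, not the only one). The family x³ - ax + b is organized by its discriminant:

Δ = 4a³ - 27b²

When Δ > 0 the cubic has three distinct real roots; when Δ < 0 it has only one; Δ = 0 marks the boundary where roots collide. The “qualitative shift” above, a cubic suddenly gaining extra real roots, happens precisely as the pair (a, b) crosses the curve 4a³ = 27b². Seeing that a single expression governs the whole family is the structural move current models rarely make unprompted. (The counterexample item, for the record, is answered by S₃, the symmetric group of permutations of three objects: the smallest non-abelian group, and the only non-abelian group of order 6.)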
AlphaEvolve: Promise Without Perspective
Google’s AlphaEvolve offers a glimpse of what deeper mathematical search might look like. It generates variations of algorithms, evaluates them on benchmarks, and selects improvements that outperform prior versions.
This is more than pattern-matching. It involves testing, comparison, and iteration. In a sense, it’s a primitive form of self-improvement—a system that learns to evolve better solutions.
The upside is clear: AlphaEvolve integrates generation with verification. It doesn't just guess — it checks. And that feedback loop is a necessary step toward deeper AI reasoning.
But even here, the process is externally grounded. AlphaEvolve doesn’t reason about why an improvement works. It doesn’t reflect on whether the solution generalizes to broader settings or whether it hints at a deeper principle.
It’s evolution without insight.
Even the selection criterion—“performs better on X”—is defined externally. The system doesn't ask whether the improved algorithm is more elegant, or foundational, or points to a new abstraction. It lacks intentionality. It doesn't explore the space of ideas, only the space of outcomes.
AlphaEvolve marks progress. But it hasn’t crossed the threshold. It plays with variation. But it doesn’t originate purpose.
To be fair, some approaches — like neural-symbolic reasoning or theorem-proving via language model outputs — aim to address these limitations. Yet even these remain bounded by external verification, not internal judgment.
The Apple Paper: When the Illusion Breaks
Apple's recent research paper, The Illusion of Thinking, tested modern LLMs on classic logic puzzles—Tower of Hanoi, river-crossing tasks, and other structured reasoning challenges.
The results? Performance dropped to zero.
Children can solve these puzzles. AI could not.
Even when given the correct algorithm, models failed to apply it. They produced shorter, shallower responses as difficulty rose.
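For reference, the algorithm in question is tiny. Here is the standard recursive solution to Tower of Hanoi as a plain sketch; the function name and interface are illustrative, not the exact prompt or harness used in the paper:

```python
def hanoi(n, source, target, spare, moves):
    # Append the moves that transfer n disks from source to target.
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)  # clear the n-1 smaller disks out of the way
    moves.append((source, target))              # move the largest remaining disk
    hanoi(n - 1, spare, target, source, moves)  # restack the smaller disks on top of it

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves), moves)  # 7 moves for 3 disks; 2**n - 1 in general
```

Executing this faithfully, disk after disk, is pure bookkeeping; that is what makes the reported collapse so telling.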
This isn’t a bug. It’s the architecture. The models do not "think."
They simulate thinking. But they don’t test their reasoning. They don’t correct themselves. They don’t know they’re wrong.
The Apple paper exposes the limits of surface reasoning. These systems generate fluent approximations of thinking, but collapse when they need to sustain internal logic over multiple steps.
This is where the illusion breaks. Language isn’t thought. And coherence isn’t correctness.
Even defenders of AI capability concede the point: models behave like they’re “thinking” until the scaffolding is removed. The moment ambiguity, recursion, or inference is required, they flail.
It’s not a problem of scale. It’s a problem of architecture. What’s missing is persistence of thought—the ability to hold a structure in mind, test it, and revise it.
Until models can navigate ambiguity, apply rules with purpose, and revise their logic, they won’t reason. They’ll only perform the appearance of reasoning.
This failure to sustain internal logic reveals the true dividing line in mathematical thought.
Judgment Before Execution
This essay began with infinity, not because it’s hard to compute, but because it marks a deeper shift in how we think. It’s where procedure ends and judgment begins.
That’s the real fork in mathematical reasoning. Not between what’s easy or hard, but between what is given and what is constructed.
Much of mathematics isn’t about solving a problem handed to you. It’s about deciding what the problem should be. What structure is worth studying? What pattern is trying to be seen? What generalization brings clarity?
This is not a mechanical step. It’s not about plugging into a formula or applying a learned move. It’s about seeing a direction where none yet exists.
In chess, the board is fixed. In protein folding, the energy landscape is physical. But in mathematics, the terrain itself is invented. Definitions, objects, analogies — these aren’t constraints. They’re choices. The path isn’t mapped. The map is the path.
This is where AI still falters. It excels at execution — following rules, generating formal steps, even solving highly structured problems. But it lacks judgment. It does not ask whether a problem is worth solving. It does not reframe the question. It does not choose the terrain.
In my earlier essay, The Anatomy of AI Work, I described this distinction in another context. AI is becoming increasingly capable at action-level tasks — optimizing, executing, refining. But decision-level tasks — like framing, interpreting, and generalizing — remain elusive.
Mathematics makes this distinction stark. You can automate the proof of a lemma. But identifying the right lemma? That’s something else.
The most profound steps in mathematics are rarely just answers. They’re questions that reorganize understanding.
That’s why judgment is not a luxury. It’s the heart of discovery. And until AI learns not just how to walk the path, but how to invent the trail, it will remain on the outside — imitating thought, but not engaging in it.
This isn’t just a limitation of AI. It’s a risk for us. In a recent essay, AI and the Erosion of Knowing, I explored how over-reliance on tools that execute well can slowly atrophy our ability to ask, judge, and imagine. If we outsource not just the steps, but the framing of the question, we may lose the very skill that makes mathematics — and knowledge itself — creative.
What’s at stake isn’t just what AI can do. It’s what we stop doing, once it can.
Why Gödel Still Matters
Gödel’s incompleteness theorems didn’t just shatter the dream of a complete formal system. They redefined what it means to know.
His proof used only finite tools—arithmetic, syntax, and symbolic encoding. Yet what it uncovered was infinite: in any sufficiently rich formal system, there are true statements that can never be proven within it.
The problem isn’t the tools. It’s the boundaries.
Some truths don’t resist proof because they’re too complex. They resist because the system can’t even see the need to ask.
And that insight cuts to the heart of the AI debate.
Large models manipulate symbols, match patterns, optimize objectives—but they don’t ask: Is this system enough?
Mathematicians do. We construct new frameworks, challenge assumptions, and pose questions that stretch the boundaries of what we thought was expressible.
But—and this is crucial—even mathematicians can’t answer every question. That’s what Gödel showed. There are truths we feel are true but cannot yet prove. Questions that point to real structure but remain unresolved.
The difference is: we know we don’t know.
We navigate with intuition, taste, and judgment. We recognize when something matters—even if it escapes our grasp. That awareness, that sense of what's missing, is not mechanical. It’s a distinctly human kind of seeing.
AI doesn’t have that. Not yet.
To step outside a system isn’t just to outgrow it. It’s to realize there is always something beyond.
That’s the leap Gödel made. That’s the mystery Ramanujan lived with. And that’s the frontier AI has not crossed.
From Real Numbers to New Realities
If AI someday learns to reason, it won’t be because it solves a hard problem. It’ll be because it learns to see differently — to reframe what counts as a problem in the first place.
That’s what human mathematicians have done for centuries. The greatest shifts in mathematical thought didn’t come from executing harder calculations. They came from changing the lens entirely:
From Real to Complex Numbers: For centuries, √−1 was considered meaningless. But extending the number line into the complex plane didn’t just “solve equations” — it revealed symmetries, connected algebra and geometry, and gave rise to modern physics.
From Euclidean to Riemannian Geometry: Euclid’s fifth postulate governed geometry for two millennia — until mathematicians asked, What if it doesn’t hold? Riemann’s answer didn’t just alter geometry. It laid the foundation for Einstein’s general relativity.
Cantor and the Infinities: It wasn’t obvious that some infinities are “larger” than others. Cantor’s diagonalization shattered the intuition that infinity was monolithic. In doing so, he built the modern theory of sets — and faced deep resistance from his peers, who found the idea too radical.
Turing and the Machine: Alan Turing didn’t just prove a theorem. He invented a new kind of question: What does it mean for something to be computable? The Turing machine wasn’t just a model of calculation — it was a model of limits.
Grothendieck’s Revolution: In the 20th century, Alexandre Grothendieck reimagined algebraic geometry by inventing new abstract structures — sheaves, schemes, and topoi. These weren’t tools to solve old problems faster. They reformulated what the problems were.
These shifts weren’t about finding answers. They were about redefining the space of what could be asked.
That’s not execution. That’s invention.
And until AI can make those kinds of moves — not just follow rules, but reinvent the terrain — it won’t do mathematics the way humans do.
It may answer questions. But it won’t ask the ones that matter.
Mathematical mastery of infinity often means knowing when to tame it — not to reach endlessly, but to choose where to stop.
Taming Infinity: A Human Art
Of course, mathematicians don’t just dream. They also tame.
Infinity isn’t just a symbol of the unknowable. It’s a landscape we’ve learned to navigate — not by eliminating it, but by shaping its contours.
Modern mathematics is filled with tools that bring the infinite within reach. Here are just a few:
Compactness Theorem: Shows that an infinite set of sentences is consistent whenever every finite fragment of it is — a cornerstone of model theory.
Cantor’s Diagonal Argument: Reveals the uncountable through a simple, finite maneuver (sketched just after this list) — a blueprint for thinking beyond the enumerable, and the conceptual seed behind Gödel’s incompleteness and Turing’s halting problem.
Noetherian Induction: Tames infinite descent in algebra and geometry — a finiteness principle that undergirds modern theories like Grothendieck’s schemes.
Ramsey Theory: Shows that in any sufficiently large structure, pattern is not optional but inevitable — a principle that echoes through logic, combinatorics, and theoretical computer science.
Fermat’s Last Theorem (Wiles): A question about integers — deceptively simple — was resolved only by building a vast web of modern number theory: elliptic curves, modular forms, and infinite Galois representations. The infinite wasn't sidestepped. It was harnessed.
Poincaré Conjecture (Perelman): A century-old question about the shape of three-dimensional space was settled through Ricci flow — a geometric evolution equation that gradually smooths out curvature, with the singularities that form along the way tamed by surgery. Perelman’s proof traversed the infinite, but delivered a finite, complete insight.
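The diagonal maneuver itself fits in a few lines. Suppose the real numbers in (0, 1) could be listed as r₁, r₂, r₃, and so on. Write each rₙ as an infinite decimal and build a new number d whose nth digit differs from the nth digit of rₙ (say, use 5 unless that digit is 5, in which case use 4). Then d differs from every rₙ in at least one decimal place, so it appears nowhere on the list. No such list can be complete: the reals are uncountable. A finite rule, applied position by position, exposes an infinity too large to enumerate.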
These aren’t just mathematical tricks. They are acts of vision — showing how infinite complexity can be transformed into finite understanding.
The point isn’t that infinity disappears. It’s that we’ve learned to fold it, reshape it, and hold it — without letting it slip through our hands.
The Real Bottleneck
The challenge isn’t just scale. Or compute. Or optimization.
It’s the ability to ask: What needs to be discovered here?
The deepest insights in mathematics rarely come from wandering further into infinity. They come from knowing when — and how — to stop. To name a structure. To frame a question. To trace a pattern through chaos and say: Here. This matters.
That move isn’t algorithmic. It isn’t forced. It’s not a brute-force search. It’s a leap.
A leap Ramanujan made again and again — not because he had proof, but because he saw something worth proving.
Hardy once said that Ramanujan’s theorems “must be true, because if they were not true, no one would have had the imagination to invent them.”
Their collaboration was full of tension — Hardy insisted on proof; Ramanujan followed intuition. But over time, even Hardy admitted: “I had to trust his insight.”
Another time, when Ramanujan was ill in bed, Hardy visited and mentioned that his taxi’s number — 1729 — seemed dull. Ramanujan immediately replied: “No, it is a very interesting number; it is the smallest number expressible as the sum of two cubes in two different ways.”
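The fact itself is easy to verify: 1729 = 1³ + 12³ = 9³ + 10³, and no smaller positive integer can be written as a sum of two positive cubes in two different ways.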
That’s not computation. That’s not search.
That’s intuition on fire.2
AI can derive formulas. It can test variations.
But it doesn’t see the spark.
It doesn’t get excited. It doesn’t get suspicious.
It doesn’t chase an idea that isn’t yet real.
Until machines can do that, they’ll compute.
But they won’t wonder.3
That’s the real bottleneck.
The ability to reach toward the infinite — and know when to stop.
1. LLMs have made notable advances in solving formal math problems, including competition-level questions. The gap I refer to here is not in computation, but in the capacity to reframe problems or sense structural shifts without explicit guidance.
2. Ramanujan often described his insights in mystical terms, but the point is not to assert a supernatural explanation; rather, it is to emphasize how far intuition can leap ahead of proof.
3. It’s possible that AI may eventually emulate or approximate intuition-like leaps through architectures we don’t yet fully grasp. But as of now, such shifts seem to arise from mechanisms outside the current paradigm.
At last, I’ve come across a compelling objection to the idea that AI—as it exists today—could ever truly mirror human intelligence. The gist of the argument is this:
1. Conscious Creativity vs. Subconscious Insight
We humans can only willfully create what we can consciously reason about. Yet we also reproduce ourselves biologically, giving rise to children whose minds—once matured—display powers of insight that none of us could have consciously designed. That phenomenon alone is a hard nut to crack, though not impossible to examine. And I believe one can refute the counter-claim, made by Yann LeCun and others, that the subconscious is merely an emergent property of the conscious. More on this later.
2. The Subconscious as Source of Genius
Time and again, great thinkers have insisted that their most profound discoveries sprang not from step-by-step reasoning but from dreams or sudden flashes of intuition deep in the subconscious. Consider Srinivasa Ramanujan—perhaps the most remarkable self-taught mathematician in modern memory—who maintained that his astonishing formulas “came to him” rather than being the product of deliberate calculation.
3. Ghalib’s Evocation of Divine Inspiration
Mirza Ghalib sums up this sense of unbidden revelation in a celebrated couplet:
Aate hain gaib se ye mazmeen khayal mein Ghalib,
Sareer-e-khama nava-e-Sarosh hai.
“These themes emerge from the unseen into my thoughts, O Ghalib;
Upon the stand of my pen rests the heavenly melody of the angel Sarosh.”
Ghalib isn’t claiming poetic skill—he’s describing himself as a conduit for Sarosh, the angel of inspiration. His pen merely echoes a celestial harmony that he could never conjure by conscious effort alone.
4. The AI Obstacle
If true creativity depends on channels beyond conscious reasoning—if our breakthroughs can originate in a “void” we neither inspect nor fully control—then today’s AI, built entirely on explicit algorithms and data we can trace, lacks access to that hidden wellspring. Until artificial systems can emulate or bypass that mysterious, subconscious source, they will fall short of the highest reaches of human intelligence.
Execution without a world model has its limits. It's evolutionary but not revolutionary.