An OpenAI Model Just Broke an 80-Year-Old Erdős Conjecture on Its Own — This Is What 'AI Doing Math' Looks Like

On May 20, OpenAI said one of its internal general-purpose reasoning models autonomously disproved the unit distance conjecture, a central problem in discrete geometry Paul Erdős posed in 1946. The 125-page proof leaned on deep algebraic number theory (Golod-Shafarevich theory, infinite class field towers). Fields medalist Tim Gowers called it 'a milestone in AI mathematics'; Princeton's Will Sawin pinned the gain at n^(1+δ), δ≥0.014.

May 25, 2026 (Mon) · 9 min read

OpenAI reasoning model autonomously disproves the Erdős unit distance conjecture — Source: OpenAI

Here's the deal: an AI took a problem statement and broke an 80-year-old conjecture by itself

On May 20, OpenAI announced that one of its internal general-purpose reasoning models autonomously disproved the unit distance conjecture — a central problem in discrete geometry that Paul Erdős posed in 1946. This isn't another benchmark score. The model knocked over an open problem at the heart of a math subfield, without a human walking it through the steps. That's why people are calling it a first.

The problem is easy to state. Place n points in a plane. What's the maximum number of pairs that are exactly distance 1 apart? For nearly 80 years, the field's intuition was that square-grid arrangements are essentially optimal: lay the points out in a suitably scaled regular lattice and you get close to the most unit-distance pairs possible. The model built an infinite family of configurations that beats the grid, refuting the bound everyone assumed was right.
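The counting question can be made concrete with a few lines of code. The sketch below is an illustration only, not anything from the proof: it counts pairs of points on a k-by-k integer grid at a chosen squared distance d. Rescaling every point by 1/sqrt(d) would turn those pairs into unit-distance pairs. The function names `grid_points` and `count_pairs_at_squared_distance` are hypothetical and chosen for this example.

```python
from itertools import combinations


def grid_points(k):
    """Return the k*k integer lattice points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


def count_pairs_at_squared_distance(points, d):
    """Count unordered pairs of points whose squared distance equals d.

    Integer squared distances keep the comparison exact (no floating point).
    Dividing all coordinates by sqrt(d) would make these unit-distance pairs.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d
    )


if __name__ == "__main__":
    pts = grid_points(10)  # n = 100 points
    for d in (1, 5, 25):
        # Print the squared distance and how many pairs realize it.
        print(d, count_pairs_at_squared_distance(pts, d))
```

On this 100-point grid, squared distances 5 and 25 are each realized by more pairs than squared distance 1, because the lattice reaches them in more directions. That is the intuition behind scaling the grid rather than using side length 1. The brute-force loop is quadratic in the number of points, so it only suits small experiments.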

The shocking part isn't the result — it's the method. The model didn't brute-force its way by nudging grids around. It connected the Golod-Shafarevich criterion (proved in 1964) and infinite class field towers — deep machinery from algebraic number theory — to an elementary geometry problem. Combinatorial geometers hadn't even thought to reach for those tools. Out came a 125-page proof, checked by outside mathematicians.

And the reaction carries weight. Fields medalist Tim Gowers called it "a milestone in AI mathematics." Princeton's Noga Alon called it "an outstanding achievement." Princeton's Will Sawin quantified the improvement and published a companion paper the same day. A follow-up paper on the disproof (arXiv 2605.20695) is already circulating, and the math community is actively debating it.

The players — Erdős, Gowers, Sawin, and a 'general-purpose' reasoning model

Paul Erdős (1913–1996). The most prolific mathematician of the 20th century — over 1,500 papers, and the namesake of the "Erdős number." The unit distance problem he posed in 1946 is one of the most famous open problems in combinatorial geometry, a hard-core puzzle whose upper and lower bounds barely budged for 80 years. Erdős himself famously attached a prize to it.

Tim Gowers. Fields medalist (1998), a giant in combinatorics and functional analysis, and a long-time public voice on AI's role in math. When he calls this a "milestone," it's not a courtesy — it reads as a judgment that the model crossed the line from "solving competition problems" to "research-grade discovery."

Will Sawin. Princeton mathematician who took the model's disproof and, in a companion paper, pinned the improvement down as a bound of the form n^(1+δ) with δ≥0.014. That matters because it converts a qualitative claim ("better than the grid") into a hard mathematical statement about how much better. In other words, a real human-AI collaboration loop actually ran: the model produced the construction, a human turned it into theory.

OpenAI's 'general-purpose' reasoning model. The key point: this wasn't a math-only fine-tuned system. Per OpenAI, the model (1) wasn't trained for this problem, (2) didn't search for existing solutions, and (3) didn't get step-by-step human guidance. It took the problem statement and produced a 125-page proof on its own. Unlike a theorem-proving specialist like AlphaProof, this was a general model — that's the differentiator.

What it actually broke, and how

The structure. The unit distance problem asks for u(n), the maximum number of distance-1 pairs among n points. Erdős showed that a suitably scaled square grid yields about n^(1+c/loglog n) pairs for some constant c, and he conjectured that the true upper bound has the same form. In other words, the (now-broken) belief was that the grid is nearly optimal, and for decades nothing beat it.

What the model did. It found an infinite point family that produces asymptotically more unit-distance pairs than the grid — proving that for large enough n, a configuration exists that beats the lattice. Sawin's δ≥0.014 means this edge isn't a rounding error; it's a genuine polynomial-scale improvement.
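To see why a fixed δ is a different kind of growth from the grid's, it may help to compare the two exponents directly. The comparison below ignores constant factors and treats c as the unspecified constant in the grid count n^(1+c/loglog n). Both simplifications are assumptions made here for illustration, not statements taken from the proof.

```latex
\[
n^{1+\delta} > n^{1 + c/\log\log n}
\;\iff\; \delta > \frac{c}{\log\log n}
\;\iff\; \log\log n > \frac{c}{\delta}
\;\iff\; n > \exp\!\bigl(\exp(c/\delta)\bigr).
\]
```

Because c/loglog n shrinks toward zero as n grows, any fixed δ > 0 eventually wins. With δ = 0.014 the threshold is exp(exp(c/0.014)), so how large n must be depends on the value of c, which the article does not give.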

Why algebraic number theory? This is the jaw-dropper. The unit distance problem is geometric: it's about distances in the plane. The model translated it into number-theoretic structure. The Golod-Shafarevich criterion was built to tackle abstract questions like "does an infinite class field tower exist?" The model used it to extract point arrangements, sitting on specific algebraic-integer structures, where unit-distance pairs explode. That bridge between combinatorial geometry and algebraic number theory is so counterintuitive that even human researchers only saw the connection in hindsight.

Verification. The 125-page proof was reviewed by outside mathematicians, and a related paper (2605.20695) is on arXiv. But not everyone is cheering. Skeptics raise (1) the scope and reproducibility of the verification, (2) the precise definition of "autonomous" (where does the model end and human input begin?), and (3) whether the result was marketed too aggressively. That's healthy skepticism.

Item	Old consensus	This result
Optimal arrangement	Square grid	Infinite family that beats the grid
Improvement	—	n^(1+δ), δ≥0.014 (Will Sawin)
Tools used	Combinatorial geometry	Golod-Shafarevich theory, class field towers
Proof length	—	125 pages
Author	Human mathematicians	OpenAI general reasoning model (autonomous) + human verification

Who gains what

OpenAI. First, a narrative shift. "AI solves math olympiad problems" is impressive but those problems already have answers. Breaking an open research problem is a qualitatively different asset. Second, proof of capability — it signals to enterprise and research markets that GPT-5-class reasoning models can produce genuinely new knowledge. Third, credibility — public endorsements from top authorities (Gowers, Alon) are reputation money can't buy.

Mathematics. It strengthens the "AI as collaborator" view. Just as Sawin took the model's output and built theory on top of it, expect the "model proposes candidate constructions, humans verify and write them up" workflow to spread. Faster attacks on hard problems mean higher math productivity overall.

The 'AI for Science' camp. This becomes a powerful reference for the claim that AI can make real discoveries in drug design, materials, and physics. Paired with Jack Clark predicting a "Nobel-level discovery within 12 months" the same week, it builds a sense that AI for Science has moved from slogan to track record.

Even the skeptics gain. Paradoxically, this is a good case for the cautious crowd too. Debating the definition of "autonomous," reproducibility, and verification scope will sharpen how we evaluate AI discoveries. That pressure is what makes future announcements more transparent.

Precedents — wins and failures

Win: DeepMind AlphaProof / AlphaGeometry (2024). Google DeepMind unveiled theorem-proving systems that scored at silver-medal level at the IMO in 2024. But those were specialists solving competition problems with known answers. OpenAI's case ranks a notch higher: a general model breaking an open research problem.

Win: computer proofs of the Four Color Theorem and Kepler conjecture. The 1976 proof of the Four Color Theorem and the proof of the Kepler conjecture (formally verified by the Flyspeck project in 2014) were completed by machines checking vast case sets. But there, machines executed human-designed procedures. Here, the model chose its own tools (algebraic number theory), a decisive difference.

Disputed / overstated AI math claims. History is littered with "AI solved math" headlines that turned out inflated — the model only worked inside a human-built frame, or the result didn't reproduce. The current caution around "autonomous" comes from that learned wariness. Which is exactly why outside verification and Sawin's companion paper matter so much.

How rivals counter

Google DeepMind. The most direct competitor. Expect it to merge AlphaProof / AlphaGeometry / Gemini into a "general model attacks open problems" push. Right after I/O 2026, a "Gemini cracked problem X" rebuttal wouldn't be surprising. DeepMind has the cred: its AlphaFold work earned a share of the 2024 Nobel Prize in Chemistry.

Anthropic. Could push Claude's reasoning toward math and science discovery. But Anthropic leans hard on a "safety and trust" position, so it might differentiate via "verifiable AI math" rather than discovery bragging. Jack Clark's Oxford lecture the same week sets that table.

Meta FAIR and Chinese labs. Meta via open-source math models; DeepSeek and others via their own reasoning models (the R-series) could announce "we cracked problems too." But getting public verification from top authorities is the real gate — score-bragging alone won't match this impact.

The academy itself. Some mathematicians may reinterpret the result as "doable without AI" or "the key idea was human-supplied." That's less competition than verification — and it'll only make the bar for "AI discovery" stricter.

So what actually changes — by persona

Math and theory researchers. A signal that workflows are shifting: throw candidate constructions at the model, keep verification, theory, and write-up for yourself. Recommendation — pick open problems in your field where "construction / counterexample search" is the crux, and experimentally hand them to a reasoning model.

AI engineers and researchers. The lesson is that a general reasoning model chose its own tools autonomously. The pattern of attacking domain problems without fine-tuning is direct inspiration for agent design (letting the model decide which tool to use when).

Investors and enterprises. This may be the inflection where "AI for Science" crosses from slogan to results. Next targets: fields with huge search spaces and clear verification — drug discovery, materials, chip design. Just always check for outside verification before buying the "autonomous" claim.

General readers. No direct impact, but "AI finds connections humans missed" just moved from abstract theory to demonstration. At the same time, the "what counts as autonomous" debate is a reminder to read AI-discovery news with a critical eye.

Regulators and policy. "AI produces new scientific knowledge" raises fresh governance questions — research integrity, authorship, reproducibility standards. AI's status in author lists and reproducibility requirements for discoveries will become real academic-policy fights.

Frequently Asked Questions

When was this article published?

This article was published on 2026-05-25 by spoonai.

What is the original source of this article?

The original source is OpenAI (https://openai.com/index/model-disproves-discrete-geometry-conjecture/).
