Wednesday, May 20, 2026

The unit distances problem

 OpenAI just announced that ChatGPT has disproved a conjecture about one of Erdos's most famous problems: the unit distance problem.

This problem is personal to me: I spent a good chunk of time during my Ph.D mulling over it, and it's what hooked me into computational geometry. Like most of Erdos's problems, it's really easy to state

Let P be a set of n points in the plane (amen). What is the maximum number of unit distances that can be achieved? 

(the amen is a running joke in my old field of computational geometry) 

Note that this is really about duplicate distances (because if you get a bunch of pairs of points at distance d, you can scale the point set so that d = 1). It's also trivial to see that the maximum is at least n-1 (just put points evenly spaced along the number line), and that the number can't be more than n-choose-2 (because that's the number of pairs). 

So what's the real number? Erdos showed using a fairly complicated construction that you can get a set of points that has $n^{1 + 1/\log\log n}$ unit distances. On the other side of it, a famous result from 1983 by Spencer, Szemederi and Trotter showed that you can't get more than $O(n^{4/3})$ distances. 

Erdos himself used a really elegant but weaker argument to show that you can't get more than $O(n^{3/2})$ distances. And the argument was cool and deceptive (in retrospect). To get a distance pair of 1, a circle of size 1 around a point must touch another point (and vice versa). Draw an edge between those two points when this happens. So here's a fun fact: you can't create a situation where two points are connected by edges to each of three other points. Or to be more formal, you can't create a situation where there's a $K_{2,3}$ bipartite graph hidden inside the set of edges. By a well known result in graph theory this means this graph can't have too many edges in it (because if it did you'd eventually find one of these special graphs): specifically no more than $O(n^{3/2})$ edges. 

So there's a limit. But what's the real limit? A long standing conjecture was that you could NOT get anything nontrivially more than n pairs. Specifically that you couldn't get $\Omega(n^{1 + \epsilon})$ pairs for any $\epsilon$. This is frustrating because the gap between this, and $n^{4/3}$ is huge. 

Turns out this conjecture was wrong. And ChatGPT proved it, by building a complicated generalization of Erdos's construction that is indeed of size $\Omega(n^{1 + \epsilon})$ for some $\epsilon > 0$. 

This was a tantalizing, infuriating, and beautiful problem that has resisted progress for a very long time, and touches on some very deep concepts in mathematics. It's really impressive that an AI system has provided a proof for it. For more on the significance of the result and some interpretation of the proof technique, check out the companion article

Thursday, May 07, 2026

The AAAI 2026 AI review experiment

 AAAI did an experiment this year where they supplemented human reviews with AI-generated reviews and solicited feedback from authors and the review hierarchy about the process. They've now written up the experiment

The paper isn't too long, and I'd encourage you to read the whole thing (or, I don't know, put it into notebookLM and make a podcast out of it!). Some interesting points stood out to me as I read the report. 

The complexity of the process

The process of architecting the AI review was not the cartoonish "hey ChatGPT review this paper for me". It was carefully structured to focus on specific elements of the review (content, readability, evaluation, setup, etc). The system had what is now standard: a second LLM that acted as a critic and was not told where the review came from, and a third LLM that has to integrate the analysis of the critic and the original review into a final review. I've heard plenty of cases where this architecture does better than just getting the review or even just having a judge. 

To be clear, the critic was only doing a 'meta review'. It didn't have access to the original paper, so its goal was mostly structural/formal: does the review have all the elements and does it avoid things like accidental author reveal, or obnoxious comments etc. 

One thing that wasn't clear from the article was how exactly the LLM was checking code, experiments, theorems and proofs, "using the code interpreter as needed". I'd want to see more details about that seemingly agentic handoff. 

The perception of the results

There's a pretty dramatic signal in the survey results (and the number of responses was decent). AI-generated reviews were viewed as better than human generated reviews along six of nine categories. Where humans did better was on not nitpicking, identifying technical errors, and providing useful suggestions, but where AI reviews did better included being thorough and providing useful suggestions for improvement (which reminds of https://www.refine.ink/)

It was interesting to see that almost across the board, authors were more enthusiastic about the AI reviews than the reviewer hierarchy. If I'm being my burnt-out AC persona, I'd say this is because authors are likely grateful to get any kind of thorough review of their paper, and man do human reviews of papers suck. 

The human-AI interaction

The survey had free form responses that were interesting from a qualitative perspective. I think this is where the report fell down a bit, because I suspect there's a rich trove of analysis to do on the assessments that people wrote in. A couple of highlighted examples though brought home the important point that perhaps the best use of these AI reviews is before submission itself, kind of like what STOC 2026 did in their experiment. Because the AI reviews are great at identifying lots of small things that a friendly pre-submission review might miss, but they don't have the same kind of judgement and taste that a person has. 

A minor notes:

  • The process cost less than $1/paper, for 30,000 submissions. That's not a bad amount to spend. But you have to wonder why reviewers can't get compensation for their work but OpenAI gets paid. 

Disqus for The Geomblog