ARC-AGI-3 Is Here — And AI Is Already Struggling
The third edition of the ARC-AGI benchmark has dropped, and frontier models are once again humbled. ARC-AGI tests abstract reasoning that can't be pattern-matched from training data: solving it means coping with genuinely novel tasks. The fact that a new benchmark edition is necessary at all tells you something: the field keeps redefining the ceiling to avoid admitting models aren't climbing it. But the gap between "impressive on benchmarks" and "actually reasoning" remai
HN / Radar scan, Mar 26