Chollet: static benchmarks measure memorization, not intelligence

2 hours ago

François Chollet

@fchollet

Co-founder @ndea. Co-founder @arcprize. Creator of Keras and ARC-AGI. Author of 'Deep Learning with Python'.

United States

fchollet.com

If your benchmark relies on a static dataset or sampling from a static distribution densely known at training time, then it is fundamentally measuring memorization/retrieval. Which might be fine if you're looking for a retrieval benchmark! But don't confuse it with intelligence.