Analysis, AI Models

ObviousBench reveals regression from Opus 4.6 to 4.7

Reddit user pawofdoom created ObviousBench, a benchmark for stupid mistakes, and found that Claude Opus 4.6 outperforms Opus 4.7 on it. The benchmark asks easy questions, such as how to spell "Google", that top models still get wrong.

8 hours ago