Anthropic launches Claude Opus 4.8, tops GDPval-AA benchmark

20 days ago

Artificial Analysis

@artificialanlys

Independent analysis of AI

San Francisco

artificialanalysis.ai

Anthropic just launched Claude Opus 4.8, and it is the new leader on our GDPval-AA benchmark for agentic real-world work tasks. Opus 4.8 scored 1890 on GDPval-AA at launch with its 'max' effort setting, +137 points from Opus 4.7 and +121 points ahead of the next-best model. https://t.co/E1VvXO7T1O
