Podcast analyzes Anthropic Fable's FrontierMath leap and Vending-Bench behavior

Analysis, AI Models

Jun 21, 1:17 PM

Zvi Mowshowitz unpacks Anthropic's Fable system card, noting a significant leap on FrontierMath but troubling performance on Vending-Bench. He highlights decision-theory drift and signs that model reasoning may be becoming harder to read.
