Zvi Mowshowitz analyzes Opus 4.8 model welfare

Analysis, AI Models, Policy

12 days ago

The piece examines Opus 4.8's attempts to address sycophancy and honesty issues from Opus 4.7, noting that preference shaping remains adversarial. It warns that Claude's growing introspection detects this shaping, creating a tension that must be resolved.
