Claude Mythos Preview beats human researchers 64% of the time

AnalysisAI Models

9 days ago

Claude Mythos Preview beats human researchers 64% of the time

We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.

anthropic.com

View on X

Anthropic

@AnthropicAI

AI research is a series of next-step decisions. We looked at sessions where a human researcher took a wrong turn, showed Claude the session up to that point, and asked it what to do next. Mythos Preview improved on humans 64% of the time—up from 22% in 2024. https://t.co/Y0HLoktxrt

9 days ago