GPT-5.5 tops DeepSWE benchmark with 70% pass@1


May 31, 6:27 PM

Official updates for developers building with Codex & the OpenAI Platform • Service status: https://t.co/kZwnwdYYEq

OpenAI Developers

@OpenAIDevs

RT @reach_vb: GPT-5.5 is #1 on DeepSWE, a hard long-horizon coding benchmark 🔥 70% pass@1 vs 58% for Claude Opus 4.8. And GPT-5.5 gets th…
