Analysis · AI Models · Developers
13 days ago
GPT-5.5 tops DeepSWE coding benchmark, beating Claude Opus 4.8

OpenAI Developers
@OpenAIDevs
RT @reach_vb: GPT-5.5 is #1 on DeepSWE, a hard long-horizon coding benchmark 🔥 70% pass@1 vs 58% for Claude Opus 4.8. And GPT-5.5 gets th…