ChatGPT-5.5 beats Opus on new DeepSWE benchmark

May 28, 1:19 AM

ChatGPT-5.5 outperforms Opus on DeepSWE, a realistic coding benchmark claimed to be contamination-free and spanning 91 repositories across 5 languages. The benchmark emphasizes real-world complexity over synthetic tasks.
