Analysis · AI Models
May 28, 1:19 AM
ChatGPT-5.5 beats Opus on new DeepSWE benchmark
ChatGPT-5.5 outperforms Opus on DeepSWE, a realistic coding benchmark that is claimed to be contamination-free and spans 91 repositories across 5 languages. The benchmark emphasizes real-world complexity over synthetic tasks.
