Qwen 3.6 27B scores 2% on DeepSWE benchmark

Analysis, AI Models

10 days ago


Qwen 3.6 27B scored 1.79% on the DeepSWE benchmark (2% when rounded), placing 18th of 20 models, ahead of Haiku 4.5 and Minimax M2.7. The full benchmark run took 70 hours, averaging 32 minutes and 44k output tokens per task.
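As a rough sanity check on the reported figures, the sketch below derives an implied task count and total output volume. It assumes tasks ran strictly one after another and that the 70 hours is the sum of per-task times. Neither assumption is stated in the source, and the function names are illustrative only.

```python
# Figures quoted in the summary above.
TOTAL_HOURS = 70
MINUTES_PER_TASK = 32
OUTPUT_TOKENS_PER_TASK = 44_000


def implied_task_count(total_hours: float, minutes_per_task: float) -> float:
    """Tasks implied by total runtime, ASSUMING sequential execution
    with no overhead between tasks (an assumption, not a reported fact)."""
    return total_hours * 60 / minutes_per_task


def implied_total_output_tokens(task_count: float, tokens_per_task: float) -> float:
    """Total output tokens implied by the per-task average."""
    return task_count * tokens_per_task


tasks = implied_task_count(TOTAL_HOURS, MINUTES_PER_TASK)
tokens = implied_total_output_tokens(tasks, OUTPUT_TOKENS_PER_TASK)

print(f"implied tasks: {tasks:.0f}")
print(f"implied output tokens: {tokens / 1e6:.1f}M")
```

Under these assumptions the run works out to about 131 tasks and roughly 5.8M output tokens. These are back-of-the-envelope estimates, not figures reported by the benchmark, and parallel execution or retries would change them.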
