Analysis · AI Models

OpenMythos benchmarks published, showing SWE-bench gap vs Qwen 3.6 27B

Benchmarks for the OpenMythos model have been released, revealing a discrepancy in SWE-bench performance compared with Qwen 3.6 27B's official numbers. The Qwen team used a different eval harness and a filtered set of benchmark problems, which likely accounts for the difference.

AIBriefs · Jun 23, 6:56 PM