Analysis · AI Models
Jun 23, 6:56 PM
OpenMythos benchmarks published, showing SWE-bench gap vs Qwen 3.6 27B
Benchmarks for the OpenMythos model have been released, showing a gap in SWE-bench performance relative to the official numbers for Qwen 3.6 27B. The Qwen team used a different evaluation harness and a filtered set of benchmark problems, which likely accounts for the difference.
