Scales test-time compute for Qwen-3.6-27B and Gemma-4-31B to beat Claude Mythos

Analysis | AI Models

Jun 12, 8:55 PM


A user scaled test-time compute on Qwen-3.6-27B and Gemma-4-31B with a scaffold that uses 25-40x more compute than the baseline. The reported settings were a branch-exploration breadth of 5, an iterative-correction loop depth of 10, and 6 branch-aware selective hypotheses revised every 2 iterations. The approach reportedly surpassed Claude Mythos on code optimization tasks.
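The post does not include the scaffold itself, so the following is only a minimal sketch of how the reported settings could fit together in a search loop. Everything in it is an assumption made for illustration: the names `ScaffoldConfig`, `toy_score`, `propose`, and `run_scaffold` are hypothetical, the model call is replaced by a random perturbation, and the "hypotheses" are modeled as simple step sizes. It is not the user's method and it does not reproduce the 25-40x compute figure.

```python
import random
from dataclasses import dataclass


@dataclass
class ScaffoldConfig:
    # Values taken from the reported settings; the field names are hypothetical.
    breadth: int = 5         # branches explored per iteration
    depth: int = 10          # iterative correction rounds
    num_hypotheses: int = 6  # hypotheses kept in the pool
    revise_every: int = 2    # revise the pool every N iterations


def toy_score(candidate: float) -> float:
    """Stand-in for a code-optimization benchmark; higher is better, peak at 3.0."""
    return -(candidate - 3.0) ** 2


def propose(parent: float, hypothesis: float, rng: random.Random) -> float:
    """Stand-in for a model call: perturb the parent within the hypothesis step size."""
    return parent + rng.uniform(-hypothesis, hypothesis)


def run_scaffold(cfg: ScaffoldConfig, seed: int = 0) -> tuple[float, int]:
    rng = random.Random(seed)
    best = 0.0  # baseline candidate
    # Assumption: each hypothesis is just a step size for the perturbation.
    hypotheses = [0.25 * (i + 1) for i in range(cfg.num_hypotheses)]
    calls = 0

    for iteration in range(1, cfg.depth + 1):
        # Breadth: expand several branches from the current best candidate.
        branches = []
        for _ in range(cfg.breadth):
            hypothesis = rng.choice(hypotheses)
            branches.append(propose(best, hypothesis, rng))
            calls += 1

        # Correction step: keep the best branch only if it improves the score.
        top = max(branches, key=toy_score)
        if toy_score(top) > toy_score(best):
            best = top

        # Revision: every `revise_every` iterations, narrow all hypotheses.
        if iteration % cfg.revise_every == 0:
            hypotheses = [h * 0.8 for h in hypotheses]

    return best, calls


if __name__ == "__main__":
    best, calls = run_scaffold(ScaffoldConfig())
    print(f"best candidate: {best:.3f}, proposal calls: {calls}")
```

In this toy version the number of proposal calls is simply breadth times depth, and the final candidate can never score worse than the baseline because a branch is accepted only when it improves the score. A real scaffold would replace `propose` with model generations and `toy_score` with an actual benchmark run.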
