Analysis · AI Models

Community PPO fine-tune of Qwen-35B-A3 beats GLM-5.2 and Qwen-350B

The model, a PPO fine-tune of Qwen-35B-A3, outperforms GLM-5.2 and Qwen-350B on karpathy/autoresearch parameter-golf. A user reports that its generated ideas feel similar to those of Opus 4.8.

Jun 17, 12:35 PM