Analysis · Developers

User discusses 4x Ascend GX10s for GLM5.2 inference

A Reddit user shares performance numbers for GLM5.2 running on 4x Ascend GX10s: 400-500 tok/s for prompt processing and about 15 tok/s for output generation at 128k context. The user is considering buying the hardware to run future open-source models.
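To put those rates in wall-clock terms, here is a rough back-of-the-envelope sketch. It assumes the prompt fills the whole context (taken here as 128,000 tokens), that the reported rates hold constant over the run, and a hypothetical 1,000-token reply; none of these assumptions come from the Reddit post, and the function names are illustrative only.

```python
def prefill_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Time to process a prompt at a constant prompt-processing rate."""
    return prompt_tokens / tokens_per_second


def decode_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to generate output tokens at a constant generation rate."""
    return output_tokens / tokens_per_second


# Assumptions (not from the post): full-context prompt, 1,000-token reply.
PROMPT_TOKENS = 128_000
OUTPUT_TOKENS = 1_000

# Reported rates: 400-500 tok/s prompt processing, ~15 tok/s output.
slow_prefill = prefill_seconds(PROMPT_TOKENS, 400.0)
fast_prefill = prefill_seconds(PROMPT_TOKENS, 500.0)
reply_time = decode_seconds(OUTPUT_TOKENS, 15.0)

print(f"Prefill: {fast_prefill:.0f}-{slow_prefill:.0f} s")
print(f"Reply of {OUTPUT_TOKENS} tokens: {reply_time:.0f} s")
```

Under these assumptions, a full-context prompt would take roughly 4.3 to 5.3 minutes to process before the first output token appears, and a 1,000-token reply would take a little over a minute to generate.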

7 hours ago