Analysis · Developers
7 hours ago
User discusses 4x Ascend GX10s for GLM5.2 inference
A Reddit user shares performance numbers for GLM5.2 on 4x Ascend GX10s: 400-500 tok/s prompt processing and ~15 tok/s output at 128k context. The user is considering the purchase for future open-source model inference.