User discusses 4x Ascend GX10s for GLM5.2 inference

7 hours ago

A Reddit user shares performance numbers for GLM5.2 on 4x Ascend GX10s: 400-500 tok/s prompt processing and ~15 tok/s output at 128k context. The user is considering the purchase for future open-source model inference.
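To put those rates in perspective, here is a rough back-of-the-envelope sketch of what they would mean for a single request. It assumes a full 128k prompt is exactly 128,000 tokens, that the quoted rates hold constant for the whole request, and that the reply is 1,000 tokens long (a hypothetical figure, not from the post).

```python
def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time to process `tokens` at a constant rate."""
    return tokens / tokens_per_second


PROMPT_TOKENS = 128_000        # assumption: "128k" taken as 128,000 tokens
PREFILL_RATES = (400.0, 500.0) # tok/s prompt processing, as quoted in the post
DECODE_RATE = 15.0             # tok/s output, as quoted in the post
REPLY_TOKENS = 1_000           # hypothetical reply length for illustration

prefill_slow = seconds_for(PROMPT_TOKENS, PREFILL_RATES[0])
prefill_fast = seconds_for(PROMPT_TOKENS, PREFILL_RATES[1])
decode = seconds_for(REPLY_TOKENS, DECODE_RATE)

print(f"Prompt processing: {prefill_fast:.0f}-{prefill_slow:.0f} s")
print(f"Generating {REPLY_TOKENS} tokens: {decode:.1f} s")
```

Under these assumptions, reading a full-context prompt would take roughly 256 to 320 seconds (about 4 to 5 minutes) before the first output token, and a 1,000-token reply would add about 67 seconds.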