Analysis | AI Models | July 1, 2026
GLM5.2 tested on Ascend GX10 with performance numbers
Reddit users report that GLM5.2 reaches 400-500 tok/s for prompt processing and about 15 tok/s for output at 128k context on 4x Ascend GX10 hardware. The performance is considered usable with quantization.
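To put the reported throughput in wall-clock terms, here is a rough back-of-the-envelope sketch. It assumes that "128k" means 131,072 tokens, that the whole context is processed from scratch with no prompt caching, and that the rates stay constant over the run; none of this is confirmed by the report. The 1,000-token reply length is a hypothetical value chosen only for illustration.

```python
# Rough latency estimate from the reported throughput figures.
# Assumptions (not from the report): 128k = 128 * 1024 tokens, the full
# context is processed with no caching, and throughput is constant.

def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time needed to process `tokens` at a steady `tokens_per_second`."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second

CONTEXT_TOKENS = 128 * 1024        # assumed meaning of "128k context"
PREFILL_TPS_LOW = 400.0            # reported lower bound, prompt processing
PREFILL_TPS_HIGH = 500.0           # reported upper bound, prompt processing
DECODE_TPS = 15.0                  # reported approximate output rate
EXAMPLE_REPLY_TOKENS = 1000        # hypothetical reply length

slowest_prefill = seconds_for(CONTEXT_TOKENS, PREFILL_TPS_LOW)
fastest_prefill = seconds_for(CONTEXT_TOKENS, PREFILL_TPS_HIGH)
reply_time = seconds_for(EXAMPLE_REPLY_TOKENS, DECODE_TPS)

print(f"Prefill of a full context: {fastest_prefill / 60:.1f} to "
      f"{slowest_prefill / 60:.1f} minutes")
print(f"Generating {EXAMPLE_REPLY_TOKENS} tokens: {reply_time:.0f} seconds")
```

Under these assumptions, filling the entire context takes roughly 4.4 to 5.5 minutes before the first output token appears, and a 1,000-token reply adds a little over a minute. Shorter prompts scale down proportionally.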