
Xiaomi claims 1,000+ tps on 1T model with 8-GPU server

Xiaomi says MiMo-V2.5-Pro UltraSpeed reaches a decode speed of over 1,000 tokens/s on a 1-trillion-parameter mixture-of-experts (MoE) model running on a standard 8-GPU server. The API is available June 9–23, priced at 3x the standard rate in exchange for 10x the speed.
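To put the pricing trade-off in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the "10x" is measured against the standard tier, which would put the standard tier at about 100 tokens/s if UltraSpeed runs at 1,000 tokens/s. That baseline is an inference, not a figure from Xiaomi. The function name `compare_tiers`, its parameters, and the placeholder price of 1.0 per token are all hypothetical and used only for illustration.

```python
def compare_tiers(output_tokens, fast_tps=1000.0, speedup=10.0,
                  price_multiplier=3.0, standard_price_per_token=1.0):
    """Compare decode time and cost for a standard vs. a faster, pricier tier.

    Assumptions (not from Xiaomi): the speedup is relative to the standard
    tier, and price scales linearly with output tokens. The price unit is
    an arbitrary placeholder.
    """
    standard_tps = fast_tps / speedup  # implied baseline decode speed
    return {
        "standard_seconds": output_tokens / standard_tps,
        "fast_seconds": output_tokens / fast_tps,
        "standard_cost": output_tokens * standard_price_per_token,
        "fast_cost": output_tokens * standard_price_per_token * price_multiplier,
    }


if __name__ == "__main__":
    result = compare_tiers(50_000)
    for key, value in result.items():
        print(f"{key}: {value:,.1f}")
```

Under these assumptions, a 50,000-token response would take about 50 seconds instead of about 500, at three times the cost. Actual figures depend on the real standard-tier speed and pricing, which the brief does not state.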

9 days ago