Launch | AI Models
9 days ago
Xiaomi claims 1,000+ tps on 1T model with 8-GPU server
Xiaomi says MiMo-V2.5-Pro UltraSpeed reaches a decode speed of over 1,000 tokens/s on a 1-trillion-parameter MoE model running on a standard 8-GPU server. The API is available June 9–23 at 3x the standard price for 10x the speed.
