DeepSeek V4 on dual DGX Sparks: 40 tk/s single 1M context, 350 tk/s aggregate


Jun 14, 9:07 AM

The setup achieves 40 tokens/s on a single 1M-token context and 350 tokens/s in aggregate. The post includes benchmarks against the RTX Pro 6000 and a Mac M2 Ultra with 192 GB of memory. The author credits Nvidia community recipes for the setup.
