Analysis · Developers

NVIDIA's inference software cuts token costs 5x on Blackwell

NVIDIA's inference stack on Blackwell cut DeepSeek V4 token costs by up to 5x in one month. Baseten saw 50% more tokens per second with TensorRT-LLM, and DigitalOcean boosted throughput by 30% for healthcare AI.

20 hours ago