NVIDIA unveils Nemotron 3 Ultra, a 550B MoE reasoning model

Launch, AI Models

Jun 4, 11:48 AM

Nemotron 3 Ultra has 550B total parameters (55B active) and supports a 1M-token context window, reaching up to 350 tokens per second on agentic tasks. The open-source MoE model is designed for multi-turn agent workflows, including planning and tool use, at up to 30% lower cost than comparable models.

Nemotron 3 Ultra now available on AI Gateway (22 days ago, Jerilyn Zheng)