Nemotron 3 Ultra is a hybrid Mamba-Transformer mixture-of-experts (MoE) model with 550B total parameters, of which 55B are active per token, and a 1M-token context window. It is optimized for long-running agents and is claimed to deliver up to 5x faster inference and 30% lower cost. Its open weights are released under OpenMDW 1.1.
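
To put the parameter counts in perspective, here is a rough back-of-the-envelope sketch. The 550B and 55B figures come from the text above; the byte widths per precision and the helper names (`active_fraction`, `weight_memory_gb`) are assumptions made for illustration, not published deployment details. The estimate covers weight storage only and ignores the KV cache, activations, and runtime overhead.

```python
# Illustrative arithmetic only. Parameter counts are taken from the text;
# the precisions and byte widths below are assumptions for this sketch.
TOTAL_PARAMS = 550e9   # total parameters (from the text)
ACTIVE_PARAMS = 55e9   # parameters active per token (from the text)

# Assumed storage width in bytes per parameter (hypothetical choices).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "4-bit": 0.5}


def active_fraction(total: float, active: float) -> float:
    """Return the fraction of parameters used for each token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active fraction per token: {frac:.0%}")
    for name, width in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, width)
        print(f"{name}: ~{gb:,.0f} GB of weights")
```

Under these assumptions, about 10% of the parameters take part in each token's forward pass, yet all 550B must still be stored. Per-token compute therefore scales with the 55B active parameters, while memory scales with the full 550B.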