9 days ago
NVIDIA NVFP4 recipe boosts pretraining speed on Blackwell
NVFP4 4-bit mixed-precision training on Blackwell delivers 7x the GEMM throughput of FP8 on Hopper with no accuracy loss. The recipe, available in MaxText and TransformerEngine, enables faster LLM pretraining on Blackwell GPUs.
