Launch · AI Models
Jun 23, 3:00 PM
Boost Inference Performance up to 15x on NVIDIA Blackwell Using DFlash Speculative…
DFlash, an open-source block diffusion drafter, speeds up gpt-oss-120b inference on NVIDIA Blackwell by up to 15x. It also nearly doubles interactivity for Llama 3.1 8B compared with EAGLE-3, and 20 checkpoints are available on Hugging Face.
