Jun 18, 2:14 PM
Fable 5's WebGPU kernels run Gemma 4 at 255 tok/s in browser
Fable 5, an agentic optimizer, wrote custom WebGPU kernels that reach 255 tokens/s for Gemma 4 inference in the browser. The demo and kernels are now released on Hugging Face Spaces.
