
Fable 5's WebGPU kernels run Gemma 4 at 255 tok/s in browser

Fable 5, an agentic optimizer, wrote custom WebGPU kernels that run Gemma 4 inference in the browser at 255 tokens/s. The demo and the kernels are now available on Hugging Face Spaces.
