Google releases Gemma 4 QAT models for efficient on-device inference

8 days ago

New quantization-aware training (QAT) checkpoints reduce the memory footprint of Gemma 4 E2B to 1 GB for mobile deployment. Because the model is trained with quantization in the loop, QAT loses less quality than standard post-training quantization, enabling local inference on consumer hardware.
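
To make the QAT idea concrete, here is a minimal sketch of the "fake quantization" step that quantization-aware training typically applies to weights in the forward pass: values are rounded onto a low-bit integer grid and immediately mapped back to floats, so the model learns to tolerate the rounding error. This is an illustration only, not Google's recipe for Gemma 4. The function name `fake_quantize`, the symmetric per-tensor scheme, and the 4-bit default are all assumptions chosen for the example.

```python
def fake_quantize(weights, num_bits=4):
    """Quantize-dequantize round trip on a list of floats.

    Hypothetical helper: symmetric, per-tensor scaling is assumed here
    for simplicity; real QAT pipelines may use other schemes.
    """
    qmax = 2 ** (num_bits - 1) - 1            # largest signed level, 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                  # all zeros: nothing to quantize
    scale = max_abs / qmax                    # float step between integer levels
    result = []
    for w in weights:
        q = round(w / scale)                  # snap to the nearest integer level
        q = max(-qmax - 1, min(qmax, q))      # clamp to the signed integer range
        result.append(q * scale)              # map back to float for the forward pass
    return result


if __name__ == "__main__":
    original = [-1.0, -0.3, 0.0, 0.26, 1.0]
    print(fake_quantize(original, num_bits=4))
```

During training, the rounding step is usually treated as the identity when computing gradients (a straight-through estimator), which this sketch omits. Post-training quantization applies the same rounding only after training has finished, so the weights never get a chance to adapt to it.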
