Reddit user maps KV cache quantization KLD for Qwen3.6 and Gemma4


Jun 23, 3:12 PM


A Reddit user measured the KL divergence (KLD) introduced by KV cache quantization on Qwen3.6-35B-A3B and Gemma4-E2B QAT. The results show that q8/q8 is nearly free on both models, while q4/q4 is usable on Qwen but catastrophic on Gemma.
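For readers unfamiliar with the metric, the sketch below shows one common way to compute a mean per-token KLD between a baseline run and a quantized-cache run. It is an illustration only, not the Reddit user's actual tooling: the function names (`log_softmax`, `token_kld`, `mean_kld`) are hypothetical, and it assumes you already have the raw next-token logits from both runs at the same positions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit list."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_kld(base_logits, quant_logits):
    """KL(P || Q) in nats for one position.

    P comes from the baseline (unquantized cache) logits,
    Q from the quantized-cache logits.
    """
    if len(base_logits) != len(quant_logits):
        raise ValueError("logit vectors must cover the same vocabulary")
    log_p = log_softmax(base_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kld(base_seq, quant_seq):
    """Average per-token KLD over a sequence of positions."""
    if not base_seq or len(base_seq) != len(quant_seq):
        raise ValueError("sequences must be non-empty and the same length")
    total = sum(token_kld(b, q) for b, q in zip(base_seq, quant_seq))
    return total / len(base_seq)


if __name__ == "__main__":
    # Toy 3-token vocabulary, two positions; values are made up.
    baseline = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
    quantized = [[1.9, 0.6, -1.1], [0.0, 0.4, 2.8]]
    print(f"mean KLD: {mean_kld(baseline, quantized):.6f} nats")
```

A value near zero means the quantized cache leaves the output distribution essentially unchanged, which is what "nearly free" suggests; larger values indicate the distribution has drifted from the baseline.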
