Analysis · AI Models
Jun 23, 3:12 PM
Reddit user maps KV cache quantization KLD for Qwen3.6 and Gemma4
A Reddit user measured the KL divergence (KLD) introduced by KV cache quantization on Qwen3.6-35B-A3B and Gemma4-E2B QAT. The results show q8/q8 is nearly free on both models, while q4/q4 is usable on Qwen but catastrophic on Gemma.
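For readers unfamiliar with the metric, here is a minimal sketch of how a per-token KLD between a baseline run and a quantized-cache run could be computed from raw logits. It is not the Reddit user's script: the function names and the idea of comparing two lists of logit rows are assumptions made for illustration, and the toy numbers are not measurements from either model.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one token's logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def token_kld(base_logits, quant_logits):
    # KL(P || Q), where P is the baseline distribution (unquantized cache)
    # and Q is the distribution produced with the quantized KV cache.
    log_p = log_softmax(base_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(base_rows, quant_rows):
    # Average the per-token KLD over all evaluated positions.
    klds = [token_kld(b, q) for b, q in zip(base_rows, quant_rows)]
    return sum(klds) / len(klds)

if __name__ == "__main__":
    # Toy logits for two token positions (hypothetical values).
    baseline = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
    quantized = [[1.9, 1.1, 0.1], [0.4, 0.7, 2.8]]
    print(f"mean KLD: {mean_kld(baseline, quantized):.6f}")
```

A value near zero means the quantized cache barely changes the model's next-token distribution; larger values indicate growing divergence from the baseline.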
