Analysis · AI Models
Jun 25, 4:00 AM
Quantization inflates reasoning: study reveals the hidden cost of token inflation
A paper shows that low-bit post-training quantization reduces accuracy while increasing the number of reasoning tokens a model generates. The effect appears across math, coding, and science QA tasks with quantized reasoning models.