Quantization inflates reasoning: study reveals token inflation hidden cost — AIBriefs


Analysis · AI Models

Jun 25, 4:00 AM

Quantization inflates reasoning: study reveals token inflation hidden cost

arXiv cs.AI (top papers)

The paper shows that low-bit post-training quantization reduces accuracy while increasing the number of reasoning tokens a model generates. The effect appears across math, coding, and science QA tasks with quantized reasoning models.
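One way to make the two effects concrete is to compare a full-precision model and its quantized counterpart on the same problems. The sketch below is only an illustration: the function names `token_inflation` and `accuracy_drop` are hypothetical, the numbers are invented for the example and are not results from the paper, and the paper may define its metrics differently.

```python
from statistics import mean


def token_inflation(baseline_tokens, quantized_tokens):
    """Ratio of mean reasoning-token counts (quantized / baseline).

    A value above 1.0 means the quantized model reasons for longer on average.
    """
    if not baseline_tokens or not quantized_tokens:
        raise ValueError("both token lists must be non-empty")
    return mean(quantized_tokens) / mean(baseline_tokens)


def accuracy_drop(baseline_correct, quantized_correct):
    """Accuracy lost after quantization, in percentage points.

    Each argument is a list of 0/1 correctness flags, one per problem.
    """
    return 100.0 * (mean(baseline_correct) - mean(quantized_correct))


# Hypothetical per-problem measurements (assumed values, for illustration only).
baseline_tokens = [1000, 2000, 3000]
quantized_tokens = [1500, 2500, 3500]
baseline_correct = [1, 1, 1, 0]
quantized_correct = [1, 1, 0, 0]

print(token_inflation(baseline_tokens, quantized_tokens))  # 1.25
print(accuracy_drop(baseline_correct, quantized_correct))  # 25.0
```

Reporting both numbers together matters because a lower per-token cost from quantization can be offset when the model emits more tokens per answer.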

Quantization Inflates Reasoning: Token Inflation as a Hidden Cost of Low-Bit Reasoning Models · 3 days ago · Xinyu Lian, Walid Krichene, Beichen Huang, Masahiro Tanaka, Olatunji Ruwase, Li Zhang, Minjia Zhang

VeriBound: PAC-Bayesian Generalization Bounds for Process Reward Models Trained with Formal Verification Tools · 5 days ago · Amirul Rahman, Mohammed Sabih Alsharari

VeryTrace: Verifying Reasoning Traces through Compilable Formalism and Structured Verification · 4 days ago · Ninghan Zhong, Ahmet Ege Tanriverdi, Kaan Kale, Sriram Vishwanath

Quantized Reasoning Models Think They Need to Think Longer, but They Do Not · 26 days ago · Sanae Lotfi, Polina Kirichenko, Steven Li, Zechun Liu

Beyond Trajectory Imitation: Strategy-Guided Policy Optimization for LLM Reasoning · 4 days ago · Tianyuan Shi, Canbin Huang, Bei Li, Xin Chen, Xiaojun Quan, Jingang Wang, Qifan Wang
