MoQ and GSQ improve low-bit GGUF quantizations

Analysis · AI Models

4 days ago

kaitchup.substack.com

MoQ and GSQ are new quantization methods for the GGUF format that aim to improve model quality at very low bit widths. They could enable higher-quality 2-bit and 3-bit quantized models for local LLM inference.
