How-To · Developers · AI Models

Tutorial: Compress LLMs with FP8, GPTQ, and SmoothQuant using llmcompressor

This tutorial covers three post-training quantization schemes, starting from an FP16 baseline: FP8 dynamic, GPTQ W4A16, and SmoothQuant W8A8. It uses the llmcompressor library to apply each scheme and compare the resulting compression strategies.
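As a rough orientation before running any of the schemes, the weight bit-width in each scheme name already says roughly how much smaller the weights should get. The sketch below is plain Python arithmetic and does not call llmcompressor. The 8B parameter count and the helper names `weight_gib` and `compression_ratio` are assumptions made for illustration. It also ignores quantization scales, zero points, and any layers left unquantized, so real checkpoints will come out somewhat larger.

```python
# Weight bit-widths implied by each scheme name (W = weights, A = activations).
# Illustrative only: ignores scales, zero points, and unquantized layers.
WEIGHT_BITS = {
    "FP16 baseline": 16,
    "FP8 dynamic": 8,
    "GPTQ W4A16": 4,
    "SmoothQuant W8A8": 8,
}


def weight_gib(num_params: int, bits: int) -> float:
    """Approximate weight storage in GiB at the given bit-width."""
    return num_params * bits / 8 / 2**30


def compression_ratio(scheme: str) -> float:
    """Weight size reduction relative to the FP16 baseline."""
    return WEIGHT_BITS["FP16 baseline"] / WEIGHT_BITS[scheme]


if __name__ == "__main__":
    num_params = 8_000_000_000  # hypothetical 8B-parameter model
    for scheme, bits in WEIGHT_BITS.items():
        size = weight_gib(num_params, bits)
        ratio = compression_ratio(scheme)
        print(f"{scheme:18s} {size:6.2f} GiB  ({ratio:.0f}x vs FP16)")
```

By this estimate, FP8 dynamic and SmoothQuant W8A8 both halve weight storage, and GPTQ W4A16 cuts it to a quarter. The schemes also differ in whether activations are quantized, which affects compute cost and accuracy but not the weight size on disk.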

29 days ago