Tutorial: compress LLMs with FP8, GPTQ, SmoothQuant using llmcompressor


29 days ago


This tutorial covers three post-training quantization schemes, FP8 dynamic, GPTQ W4A16, and SmoothQuant W8A8, each starting from an FP16 baseline. It uses the llmcompressor library to apply them and to compare the resulting compression strategies.
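To make the W8A8 case concrete, here is a minimal pure-Python sketch of the idea behind SmoothQuant: per-channel factors move activation outliers into the weights before both are quantized to 8 bits, without changing the layer output. It does not use llmcompressor. The function names, the toy matrices, and `alpha=0.5` are illustrative assumptions, and the int8 step is plain symmetric round-to-nearest rather than anything the library is confirmed to do.

```python
# Illustrative sketch only: hypothetical helper names, not llmcompressor's API.

def smooth_scales(X, W, alpha=0.5):
    """Per-input-channel factors s_j = max|X[:, j]|**alpha / max|W[j, :]|**(1 - alpha).

    Assumes every channel has a nonzero weight maximum.
    """
    scales = []
    for j in range(len(W)):
        act_max = max(abs(row[j]) for row in X)
        w_max = max(abs(v) for v in W[j])
        scales.append((act_max ** alpha) / (w_max ** (1 - alpha)))
    return scales

def apply_smoothing(X, W, scales):
    """Divide activation columns by s_j and multiply weight rows by s_j."""
    X_s = [[x / s for x, s in zip(row, scales)] for row in X]
    W_s = [[w * s for w in row] for row, s in zip(W, scales)]
    return X_s, W_s

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def quantize_int8(M):
    """Symmetric per-tensor round-to-nearest: returns (integer codes, scale)."""
    max_abs = max(abs(v) for row in M for v in row)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [[max(-127, min(127, round(v / scale))) for v in row] for row in M]
    return codes, scale

# Toy data: activation column 1 is an outlier channel.
X = [[0.1, 20.0], [-0.2, -16.0]]   # shape (tokens, in_features)
W = [[0.5, -0.3], [0.02, 0.01]]    # shape (in_features, out_features)

scales = smooth_scales(X, W)
X_s, W_s = apply_smoothing(X, W, scales)

reference = matmul(X, W)       # unsmoothed layer output
smoothed = matmul(X_s, W_s)    # same output up to floating-point rounding

X_codes, X_scale = quantize_int8(X_s)
W_codes, W_scale = quantize_int8(W_s)
```

Because `(X / s) @ (s * W)` equals `X @ W`, the smoothing step is free in exact arithmetic; its benefit is that the smoothed activations span a much narrower range, so a single int8 scale wastes fewer levels on the outlier channel.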
