29 days ago
Tutorial: Compress LLMs with FP8, GPTQ, and SmoothQuant using llmcompressor
This tutorial starts from an FP16 baseline and covers three quantization schemes: FP8 dynamic, GPTQ W4A16, and SmoothQuant W8A8. It uses the llmcompressor library to apply post-training quantization and compares the resulting compression strategies.
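As a rough guide to what these scheme names imply for model size, the sketch below estimates weight storage from nominal bit widths alone (16 bits for FP16, 8 for FP8 and W8A8, 4 for W4A16). It is back-of-the-envelope arithmetic, not part of llmcompressor: the helper names and the 8-billion-parameter count are hypothetical, and it ignores quantization scales, zero points, and any layers left unquantized.

```python
# Nominal bits per stored weight for each scheme named in the summary.
# Assumption: overhead from scales/zero points and unquantized layers is ignored.
WEIGHT_BITS = {
    "FP16 baseline": 16,
    "FP8 dynamic": 8,
    "GPTQ W4A16": 4,        # 4-bit weights, 16-bit activations
    "SmoothQuant W8A8": 8,  # 8-bit weights, 8-bit activations
}


def weight_gib(num_params: int, bits: int) -> float:
    """Approximate weight storage in GiB for a given bit width."""
    return num_params * bits / 8 / 2**30


def compression_ratio(scheme: str, baseline: str = "FP16 baseline") -> float:
    """Nominal weight-size reduction relative to the baseline."""
    return WEIGHT_BITS[baseline] / WEIGHT_BITS[scheme]


if __name__ == "__main__":
    num_params = 8_000_000_000  # hypothetical model size, for illustration only
    for scheme, bits in WEIGHT_BITS.items():
        size = weight_gib(num_params, bits)
        ratio = compression_ratio(scheme)
        print(f"{scheme:<18} {size:6.2f} GiB  ({ratio:.0f}x vs FP16)")
```

Measured checkpoint sizes will typically be somewhat larger than these estimates, since quantized formats store extra metadata and often keep some layers in higher precision.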
