Analysis
Jun 8, 10:02 PM
Google quantization broken, unsloth Q4_K_XL recommended
Google's QAT quantization incorrectly quantizes the token embeddings to Q6_K instead of the intended precision. For now, users are advised to use unsloth's UD Q4_K_XL with --pure. The --pure flag only partially addresses the bug; other issues remain.