Qwen3.6 27B pure quant hits 40 tok/s on 16GB VRAM

25 days ago

A Reddit user reports running Qwen3.6 27B at 40 tok/s on an RTX 5060 Ti with 16GB VRAM using a Q4_K_M pure quantization. The GGUF is based on Ununnilium's IQ4_XS-pure-GGUF from Hugging Face.
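
For a rough sense of why a roughly 4-bit quant of a 27B model can fit on a 16GB card, the weight footprint can be estimated from parameter count and bits per weight. The sketch below is illustrative only: the function name is hypothetical, the 4.25 bits-per-weight figure is an assumed approximate average for an IQ4_XS-style quant (not taken from the report), and it ignores KV cache, activations, and runtime overhead.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB.

    Counts only the weights; KV cache and runtime overhead are not included.
    """
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB


# Assumption: ~27e9 parameters and ~4.25 bits per weight on average.
estimate = weight_gib(27e9, 4.25)
print(f"{estimate:.2f} GiB")
```

Under these assumptions the weights alone take a little over 13 GiB, which leaves limited headroom for context on a 16GB card. A quant type with a higher average bits-per-weight would leave less.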