Analysis · AI Models
25 days ago
Qwen3.6 27B pure quant hits 40 tok/s on 16GB VRAM
A Reddit user reports running Qwen3.6 27B at 40 tok/s on an RTX 5060 Ti with 16GB of VRAM, using a Q4_K_M pure quantization. The GGUF is based on Ununnilium's IQ4_XS-pure-GGUF on Hugging Face.
