Analysis · AI Models
28 days ago
Qwen3.6:27B user shares quantization configs for 16GB VRAM
A user is aiming for more than 50 tokens/s of text generation (tg) and more than 800 tokens/s of prompt processing (pp) when running a quantized Qwen3.6:27B on a 16GB RTX 5080. They offloaded the vision model to free VRAM, and note that Qwen3.5:9B is faster but less capable.
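To give a rough sense of why quantization level matters on a 16GB card, here is a minimal back-of-the-envelope sketch. It assumes the model has about 27 billion parameters (taken from the "27B" in its name), treats 16GB as 16 GiB, and counts only the weights: KV cache, activations, and runtime overhead are ignored, so real headroom would be smaller. The bits-per-weight values are illustrative and are not the user's actual configs.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 16.0      # assumption: nominal capacity of the card
PARAMS_BILLION = 27  # assumption: read off the model name

# Illustrative bits-per-weight values, not tied to any named quant format.
for bpw in (3.0, 4.0, 4.5, 8.0):
    size = weight_gib(PARAMS_BILLION, bpw)
    headroom = VRAM_GIB - size
    print(f"{bpw:.1f} bpw: {size:6.2f} GiB weights, {headroom:6.2f} GiB headroom")
```

Under these assumptions, 8 bits per weight does not fit at all, while values around 4 bits leave only a few GiB for everything else, which is consistent with the user moving the vision model off the GPU.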