Qwen3.6:27B user shares quantization configs for 16GB VRAM


28 days ago

A user running a quantized Qwen3.6:27B on a 16GB RTX 5080 is targeting more than 50 tokens/s for text generation (tg) and more than 800 tokens/s for prompt processing (pp). They offloaded the vision model to free VRAM, and note that Qwen3.5:9B is faster but less intelligent.
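To see why the quantization level matters on a 16GB card, here is a rough back-of-the-envelope sketch of how much VRAM the weights alone would take at a few bit widths. The bits-per-weight values are illustrative assumptions, not the user's actual configs, and the helper name `weight_gib` is hypothetical. The estimate treats the card as 16 GiB and ignores the KV cache, activations, and runtime overhead, all of which need room as well.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 16.0      # assumption: the 16GB card exposes about 16 GiB
PARAMS_BILLION = 27  # taken from the model name, Qwen3.6:27B

# Illustrative bit widths only; real quant formats mix precisions per tensor.
for bpw in (3.0, 4.0, 4.5, 5.0):
    size = weight_gib(PARAMS_BILLION, bpw)
    headroom = VRAM_GIB - size
    print(f"{bpw:.1f} bpw: {size:.1f} GiB weights, {headroom:.1f} GiB left for everything else")
```

Whatever headroom remains has to cover the context cache and any other model components kept on the GPU, which is consistent with the user moving the vision model off the card.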
