How-To · Developers
29 days ago
Qwen 3.6 27B local setup guide compares backends on RTX 3090
Best results on a 24 GB RTX 3090: ik_llama.cpp running Qwen3.6-27B-MTP-IQ4_KS.gguf with a 156k context and vision handled on the CPU. This setup reaches ~1261 tok/s prefill and 72.9 tok/s decode on a benchmark of a ~5.9k-token prompt plus 1k tokens of output.
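To put those throughput figures in wall-clock terms, here is a rough sketch of the arithmetic. It assumes the rounded token counts (5,900 prompt, 1,000 output) and constant throughput in each phase, and it ignores model load time and any other overhead. The function name `estimate_latency` is made up for illustration.

```python
def estimate_latency(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_s, decode_s, total_s) assuming constant throughput per phase."""
    prefill_s = prompt_tokens / prefill_tps   # time to process the prompt
    decode_s = output_tokens / decode_tps     # time to generate the output
    return prefill_s, decode_s, prefill_s + decode_s


# Rounded figures from the benchmark summary (assumed, not exact counts).
prefill_s, decode_s, total_s = estimate_latency(
    prompt_tokens=5900,
    output_tokens=1000,
    prefill_tps=1261.0,
    decode_tps=72.9,
)
print(f"prefill {prefill_s:.1f}s, decode {decode_s:.1f}s, total {total_s:.1f}s")
# prints: prefill 4.7s, decode 13.7s, total 18.4s
```

Under these assumptions, decode accounts for roughly three quarters of the total time, so decode speed matters more than prefill speed for a request of this shape.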
