How-To · AI Models
21 days ago
Dual RTX 3060 runs Qwen 3.6-27B at 30-50 t/s for $400
A user built a local LLM rig around two RTX 3060 cards for $400 that runs Qwen 3.6-27B at 30-50 tokens/s. With MTP (multi-token prediction) enabled, decode reaches 40-60 t/s, but prefill is slow at 300-500 t/s.
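To get a feel for what these rates mean in practice, here is a rough sketch that turns the reported prefill and decode throughput into an estimated response time. The workload (an 8,000-token prompt and a 500-token reply) and the function name `response_time_s` are assumptions made for illustration, not figures from the build, and the model ignores overheads such as model loading or scheduling.

```python
def response_time_s(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Seconds to process the prompt (prefill) plus generate the reply (decode)."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


# Hypothetical workload, chosen only for illustration.
PROMPT, REPLY = 8_000, 500

# Pair the slow ends and the fast ends of the reported ranges
# (prefill 300-500 t/s, MTP decode 40-60 t/s).
slow = response_time_s(PROMPT, REPLY, prefill_tps=300, decode_tps=40)
fast = response_time_s(PROMPT, REPLY, prefill_tps=500, decode_tps=60)

print(f"Estimated response time: {fast:.1f} s to {slow:.1f} s")
```

Under these assumptions, prefill alone takes roughly 16 to 27 seconds, which is more than the time spent generating the reply. That is why the prefill rate matters most for long prompts.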
