How-To · AI Models

Dual RTX 3060 runs Qwen 3.6-27B at 30-50 t/s for $400

A user built a local LLM setup from two RTX 3060 GPUs for $400 that runs Qwen 3.6-27B at 30-50 tokens/s. With MTP (multi-token prediction), decode reaches 40-60 t/s, but prefill (prompt processing) is slow at 300-500 t/s.
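To put the prefill figure in perspective, here is a rough back-of-the-envelope sketch of end-to-end response time at the reported throughputs. The prompt and reply sizes are hypothetical, chosen only for illustration, and the model ignores load time, batching, and any prompt caching.

```python
def response_time_s(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Seconds spent reading the prompt (prefill) plus generating the reply (decode)."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


# Hypothetical workload, not from the article: 8,000-token prompt, 500-token reply.
PROMPT_TOKENS = 8_000
REPLY_TOKENS = 500

# Reported ranges: prefill 300-500 t/s, MTP decode 40-60 t/s.
slowest = response_time_s(PROMPT_TOKENS, REPLY_TOKENS, prefill_tps=300, decode_tps=40)
fastest = response_time_s(PROMPT_TOKENS, REPLY_TOKENS, prefill_tps=500, decode_tps=60)

print(f"Estimated response time: {fastest:.1f} s to {slowest:.1f} s")
```

Under these assumptions, prefill alone takes roughly 16 to 27 seconds, so long prompts are dominated by prompt processing rather than generation.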

21 days ago
Dual RTX 3060 runs Qwen 3.6-27B at 30-50 t/s for $400 — AIBriefs