Qwen3.6-27B Q8 hits 100+ t/s on dual GPU with tensor split

Analysis, AI Models

6 hours ago

Qwen3.6-27B at Q8 quantization reaches about 100 tokens per second (t/s) on an RTX 5090 paired with an RTX 3090 Ti. Switching from layer split mode to tensor split mode raised throughput from 70+ t/s to that level.
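
For readers unfamiliar with the two modes, the difference can be sketched in a few lines of plain Python. This is a conceptual illustration only: the report does not name the inference runtime or its settings, so the function names, the 8-layer count, and the split ratios below are all hypothetical. Layer split hands each GPU a contiguous block of whole layers, so the GPUs mostly take turns; tensor split shards each weight matrix so both GPUs work on the same layer at once.

```python
# Conceptual sketch only: plain-Python stand-ins, not any runtime's real API.

def matvec(weight, x):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def split_points(n, ratios):
    """Cut n items into contiguous (start, end) chunks proportional to ratios."""
    total, start, chunks = sum(ratios), 0, []
    for i, r in enumerate(ratios):
        # The last device takes the remainder so that every item is assigned.
        end = n if i == len(ratios) - 1 else start + round(n * r / total)
        chunks.append((start, end))
        start = end
    return chunks

def layer_split(num_layers, ratios):
    """Layer split: each device owns a contiguous block of whole layers."""
    return [list(range(s, e)) for s, e in split_points(num_layers, ratios)]

def tensor_split_matvec(weight, x, ratios):
    """Tensor split: shard one weight matrix by rows, then join the partial outputs."""
    out = []
    for s, e in split_points(len(weight), ratios):
        # Each shard is independent, so real devices could compute them in parallel.
        out.extend(matvec(weight[s:e], x))
    return out

# Hypothetical 8-layer model split evenly across two devices.
print(layer_split(8, [1, 1]))

# Sharding a matrix by rows gives the same result as the unsharded multiply.
W = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(tensor_split_matvec(W, [1, 1], [1, 1]) == matvec(W, [1, 1]))
```

The sketch shows only that row-sharding preserves the result; it does not model inter-GPU transfer cost, which in practice determines how much of the parallelism turns into extra tokens per second.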
