Analysis · AI Models
6 hours ago
Qwen3.6-27B Q8 hits 100+ t/s on dual GPU with tensor split
Qwen3.6-27B reaches about 100 tokens per second (t/s) on an RTX 5090 paired with an RTX 3090 Ti at Q8 quantization. Switching from layer split to tensor split mode raised throughput from 70+ t/s to roughly 100 t/s.
