Analysis · AI Models
25 days ago
Qwen3.6-35B-A3B runs at 30+ tps on an 8GB GPU with Q4 quantization
A Reddit user reports running Qwen3.6-35B-A3B with Q4 quantization and a 262k-token context on an 8GB RTX 3070 Ti at more than 30 tokens per second. The context can be pushed to 1M tokens, but throughput drops beyond 150k.
