User reports 125 tok/s for Qwen3.6 q4xl on 2x 4060 Ti


18 days ago


A Reddit user reports 125 tokens per second running Qwen3.6 q4xl on two RTX 4060 Ti cards (32GB VRAM combined), on hardware costing under $1000 in total. The post presents this as strong performance per dollar for local inference and suggests throughput could reach 150 t/s with CUDA 13.3.
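To put the performance-per-dollar claim in concrete terms, here is a minimal Python sketch using only the figures quoted in the post. The function names are hypothetical, and the $1000 figure is treated as a cost ceiling (the post says "under $1000"), so the per-dollar result is a lower bound and not a measured value.

```python
def tokens_per_second_per_dollar(tok_per_s: float, cost_usd: float) -> float:
    """Throughput divided by hardware cost (hypothetical helper)."""
    return tok_per_s / cost_usd


def relative_uplift(current: float, projected: float) -> float:
    """Fractional gain going from the current to the projected throughput."""
    return (projected - current) / current


# Figures as quoted in the post; none of them are independently verified.
REPORTED_TOK_PER_S = 125.0    # claimed throughput
PROJECTED_TOK_PER_S = 150.0   # suggested throughput with CUDA 13.3
COST_CEILING_USD = 1000.0     # "under $1000", so this is an upper bound on cost

per_dollar = tokens_per_second_per_dollar(REPORTED_TOK_PER_S, COST_CEILING_USD)
uplift = relative_uplift(REPORTED_TOK_PER_S, PROJECTED_TOK_PER_S)

print(f"At least {per_dollar:.3f} tok/s per dollar")
print(f"Projected uplift: {uplift:.0%}")
```

Under these assumptions the claim works out to at least 0.125 tok/s per dollar, and the projected 150 t/s would be a 20% gain over the reported figure.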
