Analysis · AI Models

User achieves 100 tps with DiffusionGemma 4 on 4x 7900 XTX

A user reports a generation speed of 100 tokens/s on four 7900 XTX GPUs. Total throughput, including prompt processing, is around 45-60 tokens/s. The GPU KV cache holds 152,671 tokens, which gives a maximum concurrency of 1.16x for 131k-token requests.
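
The 1.16x concurrency figure appears to be the KV cache capacity divided by the per-request token length. A minimal sketch of that arithmetic, assuming "131k" means 131,072 (2**17) tokens; the variable and function names are illustrative and not taken from the user's setup:

```python
# Reported figure from the brief.
kv_cache_tokens = 152_671

# Assumption: "131k" in the brief means 131,072 (2**17) tokens per request.
max_request_tokens = 131_072


def max_concurrency(cache_tokens: int, request_tokens: int) -> float:
    """How many full-length requests fit in the KV cache at once."""
    return cache_tokens / request_tokens


ratio = max_concurrency(kv_cache_tokens, max_request_tokens)

# Tokens left in the cache after one full-length request is resident.
spare_tokens = kv_cache_tokens - max_request_tokens

print(f"max concurrency: {ratio:.2f}x")
print(f"spare KV cache tokens: {spare_tokens}")
```

Under that assumption the ratio matches the reported 1.16x: the cache fits one full-length request, with room left over only for shorter ones.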

2 days ago