User achieves 100 tokens/s with DiffusionGemma 4 on 4x 7900 XTX


2 days ago

A user reports a generation speed of 100 tokens/s on 4x 7900 XTX GPUs. Total throughput, including prompt processing, is around 45-60 tokens/s. The GPU KV cache holds 152,671 tokens, which gives a maximum concurrency of 1.16x for 131k-token requests.
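The concurrency figure appears to be the KV cache capacity divided by the per-request token budget. A minimal sketch of that arithmetic follows, assuming "131k" means 131,072 tokens (128 x 1024); the variable names are illustrative and do not come from the user's setup.

```python
# KV cache capacity reported by the user, in tokens.
kv_cache_tokens = 152_671

# Assumption: "131k" is read as 131,072 tokens (128 * 1024).
request_tokens = 131_072

# How many full-length requests fit in the cache at the same time.
max_concurrency = kv_cache_tokens / request_tokens

print(f"{max_concurrency:.2f}x")  # 1.16x
```

Under this reading, the cache fits one full-length request with roughly 16% of a second request's budget to spare, so two maximum-length requests cannot be held at once.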
