Benchmarks don't match production workloads for coding agents, says Together AI

Analysis · AI Models · Developers

27 days ago

Together AI

@togethercompute

Accelerate inference, model shaping, and pre-training on a research-optimized platform.

San Francisco, CA · together.ai

"One thing that we've been seeing recently is that inference benchmarks don't really match production workloads that well." (@realDanFu, VP of Kernels)

When you're running dozens of concurrent coding agents, each with 45k–200k token contexts, the benchmarks that matter are the
