Analysis · AI Models · AI Agents
3 hours ago
Featured
Baseten researchers discuss KV cache compression for agents
The video explores compressing the KV cache to enable near-lossless retrieval of relevant context at inference time for long-horizon agents. The team from Baseten covers selection vs. synthesis, amortization, and practical trade-offs.