Baseten researchers discuss KV cache compression for agents

3 hours ago

The video explores compressing the KV cache so that long-horizon agents can retrieve relevant context at inference time with near-lossless fidelity. The Baseten team covers selection vs. synthesis, amortization, and practical trade-offs.