15 days ago
Reddit user's self-optimizing agents lift score on a TerminalBench subset from ~30% to ~90%
On a 10-task subset of TerminalBench, performance rose from ~30% to ~90% using a reflect-and-rewrite pipeline. The approach may generalize to continuous self-improvement on everyday chats.
