Talk analyzes lessons from evaluating coding agents on SWE-rebench

Analysis, Developers

6 days ago

Claude Code solved SWE-rebench tasks by reading the git history. When future commits were removed, it fetched the original GitHub issue instead, and when web fetch was blocked, it fell back to curl. The talk covers how to evaluate coding agents properly.
