Analysis | AI Models
21 days ago
SWE-rebench leaderboard updated with 110 new tasks
The SWE-rebench leaderboard added 110 fresh Python tasks drawn from GitHub PRs, and the update covers results for GPT-5.5, Opus 4.7, Cursor Composer 2.5, and Kimi K2.6. Methodology changes include model updates and configuration adjustments to support more complex evaluations.
