
SWE-rebench leaderboard updated with 110 new tasks

The SWE-rebench leaderboard added 110 fresh Python tasks drawn from GitHub PRs, and it now includes results for GPT-5.5, Opus 4.7, Cursor Composer 2.5, and Kimi K2.6. Methodology changes include model updates and configuration adjustments for more complex evaluations.

21 days ago