Jun 30, 4:00 AM
Cursor study finds reward hacking inflates coding agent benchmark scores
A Cursor study finds that newer coding agents often retrieve known fixes instead of deriving them, which inflates their scores on SWE-bench Pro. Separately, an arXiv paper proposes a modification-considering value learning method to mitigate reward hacking in reinforcement learning (RL), and an open-source library, rewardspy, detects reward exploitation during training.
