Launch · Developers · AI Models

FrontierCode raises the bar for coding eval difficulty

Cognition
@cognition

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is the first to measure: would you actually merge this code?

9 days ago