9 days ago
FrontierCode raises the bar for coding eval difficulty

Cognition
@cognition
Makers of Devin, the first AI software engineer. We are an applied AI lab building end-to-end software agents. Join us: https://t.co/JZDd4VhMfh
San Francisco Bay Area
devin.ai/?utm_source=x&utm_medium=organic_social&utm_campaign=link_in_bio

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is the first to measure: would you actually merge this code?
