FrontierCode raises the bar for coding eval difficulty

9 days ago

Makers of Devin, the first AI software engineer. We are an applied AI lab building end-to-end software agents. Join us: https://t.co/JZDd4VhMfh

San Francisco Bay Area

devin.ai/?utm_source=x&utm_medium=organic_social&utm_campaign=link_in_bio

Cognition

@cognition

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is the first to measure: would you actually merge this code?
