Jun 26, 4:00 PM
Epoch AI launches MirrorCode benchmark for long-horizon AI coding
MirrorCode, co-developed with METR, tasks AI models with rebuilding 25 real-world programs without access to their source code. The hardest tasks cost $2,600 per run and took 19 days of AI work; Claude Opus 4.7 leads with a 56% solve rate.
