Launch · AI Models
3 hours ago
Epoch AI, METR launch MirrorCode benchmark
MirrorCode is a new long-horizon coding benchmark co-developed by Epoch AI and METR to measure the limits of autonomous AI coding abilities. It focuses on complex, multi-step tasks requiring extended reasoning.
