Launch | AI Models | Developers
19 days ago
Ai2's ArtifactLinker predicts and runs model benchmarks
Allen Institute for AI (Ai2)
@ai2.bsky.social
Most models are evaluated on only a fraction of the available benchmarks. ArtifactLinker, our new system, predicts which models would set a new state of the art on benchmarks hosted on @hf.co, then runs the evaluations to verify. 🧵