Ai2's ArtifactLinker predicts and runs model benchmarks

19 days ago

@ai2.bsky.social

Breakthrough AI to solve the world's biggest problems. › Join us: http://allenai.org/careers › Get our newsletter: https://share.hsforms.com/1uJkWs5aDRHWhiky3aHooIg3ioxm

Allen Institute for AI (Ai2)

Most models are evaluated on only a fraction of the benchmarks out there. ArtifactLinker, our new system, predicts which models would set a new state of the art on benchmarks hosted on @hf.co, then runs the evaluation to verify. 🧵
