Launch · AI Models · AI Agents

AA-Briefcase benchmark tests models on agentic knowledge work tasks

Artificial Analysis
@ArtificialAnlys

Open weights models make up the majority of the cost-performance Pareto frontier on AA-Briefcase, our new agentic knowledge work benchmark. Last week we released AA-Briefcase, our proprietary agentic knowledge work benchmark testing models on long-horizon tasks built by industry

Jun 18, 11:01 PM