AA-Briefcase benchmark tests models on agentic knowledge work tasks

Jun 18, 11:01 PM

Artificial Analysis

@artificialanlys

Independent analysis of AI

San Francisco · artificialanalysis.ai

Open-weights models make up the majority of the cost-performance Pareto frontier on AA-Briefcase, our new agentic knowledge work benchmark. Last week we released AA-Briefcase, our proprietary agentic knowledge work benchmark testing models on long-horizon tasks built by industry...

Quoted post (Artificial Analysis, 4 days ago): Announcing AA-Briefcase, the benchmark for the next era of agentic knowledge work. AA-Briefcase is...
