Launch, Developers, AI Agents
8 hours ago
AA-Briefcase benchmark launched for agentic knowledge work

Artificial Analysis
@artificialanlys
Independent analysis of AI
San Francisco
artificialanalysis.ai

Announcing AA-Briefcase, the benchmark for the next era of agentic knowledge work

AA-Briefcase is our new benchmark for testing models on long-horizon knowledge work tasks in complex projects built by industry experts. Models are evaluated on multi-week projects, each with many
