Analysis · AI Agents
8 days ago
DeskCraft benchmark evaluates desktop agents on professional workflows
DeskCraft benchmarks desktop agents on long-horizon professional tasks in creative and engineering software. It emphasizes human-in-the-loop collaboration, in which agents must proactively seek information and users supply additional context.