Analysis · AI Models · AI Agents
6 days ago
AdaPlanBench: New benchmark for adaptive planning in LLM agents
The benchmark evaluates LLM agents on planning tasks in which world and user constraints are disclosed progressively rather than all at once. It includes diverse scenarios and metrics for measuring how well agents adapt.