2 days ago
MTG Bench tests LLMs on Magic: The Gathering
A new benchmark evaluates LLMs' ability to play Magic: The Gathering, measuring strategic reasoning and rule adherence. Results show current models struggle with complex game mechanics.