Launch · AI Models

Evalatro: open benchmark where LLMs play Balatro

Evalatro is an open benchmark that evaluates LLMs by having them play the actual Balatro card game in real time. It began as a personal project for getting LLM advice on levels and grew into a full evaluation suite.

16 hours ago · AIBriefs