Analysis · AI Models · Science
8 days ago
GTBench evaluates LLMs as math research assistants in graph theory
GTBench is a curriculum-grounded benchmark that tests LLMs as mathematical reasoning assistants in graph theory. Its structured tasks assess the models' reliability and problem-solving ability.