
Frontier evals lead discusses model evaluation challenges

OpenAI
@OpenAI

Let’s talk about evals. We’re always looking for better ways to measure and forecast model progress, especially as benchmarks get saturated or gamed. @tejalpatwardhan, who leads our frontier evals team, spoke to @andrewmayne about why evals matter and what models need to be
