Frontier evals lead discusses model evaluation challenges

OpenAI
@OpenAI
Let’s talk about evals. We’re always looking for better ways to measure and forecast model progress, especially as benchmarks get saturated or gamed. @tejalpatwardhan, who leads our frontier evals team, spoke to @andrewmayne about why evals matter and what models need to be