Frontier evals lead discusses model evaluation challenges

2 hours ago

OpenAI

@OpenAI

Let’s talk about evals. We’re always looking for better ways to measure and forecast model progress, especially as benchmarks get saturated or gamed. @tejalpatwardhan, who leads our frontier evals team, spoke to @andrewmayne about why evals matter and what models need to be

Why Tejal Patwardhan stopped underestimating the models - Episode 21

OpenAI