Analysis, Policy
13 days ago
OpenAI publishes playbook for trustworthy third-party evaluations
The playbook outlines how to assess model capabilities, safeguards, and validity for frontier systems. It aims to standardize evaluation practices so that assessments are consistent and reliable.