Analysis, Policy
Jun 18, 6:00 PM
Reinforcement learning towards broadly and persistently beneficial models
OpenAI reports that RL training on realistic scenarios targeting beneficial traits improves alignment across dozens of benchmarks. Gains generalize beyond training domains and persist under adversarial pressure. The dataset spans domains like health, science, education, and coding.
