Analysis · Policy · AI Models
7 days ago
BiasGRPO stabilizes LLM bias mitigation with group-relative optimization
The paper introduces BiasGRPO, which uses group-relative policy optimization to stabilize bias mitigation in LLMs under high-variance reward conditions. Unlike verifiable tasks, bias mitigation lacks a single ground truth, making alignment challenging.
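To make the "group-relative" idea concrete, here is a minimal sketch of the advantage computation used in standard GRPO: several responses are sampled for the same prompt, and each response's reward is standardized against the mean and standard deviation of its own group. This is an illustration of the general technique, not the paper's exact recipe; the function name `group_relative_advantages`, the `eps` constant, and the reward values are assumptions made for the example.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt.

    Hypothetical helper: follows the usual GRPO formulation
    (reward minus group mean, divided by group standard deviation).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    # eps avoids division by zero when every response gets the same reward
    return [(r - mean) / (std + eps) for r in rewards]

# Made-up bias-reward scores for four sampled responses to one prompt
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
```

Because each response is scored only relative to its siblings, a noisy reward model that shifts or rescales every score in a group leaves the advantages nearly unchanged, which is one plausible reason this style of normalization helps when there is no single ground-truth answer.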