Analysis · AI Models · Policy
22 hours ago
Off-Policy Replay method improves LLM unlearning efficiency
The work proposes Off-Policy Replay, an RL-based method for LLM unlearning that removes hazardous knowledge without full retraining. It is presented as a cost-effective alternative to retraining that preserves the general utility of pretrained models.