Analysis · AI Models · Policy

Off-Policy Replay method improves LLM unlearning efficiency

The work proposes Off-Policy Replay, a reinforcement learning (RL) based method for LLM unlearning that removes hazardous knowledge without full retraining. It is presented as a cost-effective alternative to retraining that preserves the general utility of pretrained models.

22 hours ago