Analysis · AI Models
7 days ago
Paper explores applying reinforcement learning during LLM pre-training
The paper challenges the standard LLM training pipeline by applying reinforcement learning (RL) during pre-training rather than only after supervised fine-tuning (SFT). The authors compare models trained from scratch with RL, with SFT, and with both combined.