Paper explores applying reinforcement learning during LLM pre-training


7 days ago


The paper challenges the standard LLM training pipeline, in which reinforcement learning (RL) is applied only after supervised fine-tuning (SFT), by applying RL during pre-training instead. The authors compare RL, SFT, and combined training, each starting from scratch.
