Analysis · AI Models
7 days ago
Paper proposes stage-specific data sets for SFT-then-RL in SLMs
The paper argues that data strategy should match the distinct roles of the supervised fine-tuning (SFT) and reinforcement learning (RL) stages. It proposes stage-specific data sets for post-training small language models (SLMs) on reasoning.