13 hours ago
Post-training course adds reasoning and DPO lectures
Nathan Lambert
@natolambert.bsky.social
A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef. Writes http://interconnects.ai. Prev Ai2/Olmo, HuggingFace, Berkeley, and normal places.
I launched 3 more videos in my post-training course!
1. Lecture 5: The rise of reasoning models
2. Lecture 6: DPO derivation, intuitions, and practice
3. A Q&A from readers on lectures 1-4
Course page: rlhfbook.com/course
More soon!