Post-training course adds reasoning and DPO lectures

13 hours ago

@natolambert.bsky.social

A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef
Writes http://interconnects.ai
Prev Ai2/Olmo, HuggingFace, Berkeley, and normal places

Nathan Lambert

I launched 3 more videos in my post-training course!

1. Lecture 5: The rise of reasoning models
2. Lecture 6: DPO derivation, intuitions, and practice
3. A Q&A from readers on lectures 1-4

Course page: rlhfbook.com/course

More soon!
