
Post-training course adds reasoning and DPO lectures

Nathan Lambert
@natolambert.bsky.social

I launched 3 more videos in my post-training course!
1. Lecture 5: The rise of reasoning models
2. Lecture 6: DPO derivation, intuitions, and practice
3. A Q&A from readers on lectures 1-4
Course page: rlhfbook.com/course
More soon!

13 hours ago