Step 1 — Vol 8 · Ch 10
Chapter 80: RL from Human Feedback (RLHF) Basics
Fit a Bradley-Terry reward model to pairwise human comparisons, then train the policy against that reward with PPO.
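The two ingredients named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the chapter's own implementation. It assumes scalar reward scores for each response in a comparison, and the function names (`log_sigmoid`, `bt_preference_prob`, `bt_loss`, `ppo_clipped_objective`) and the sample scores are hypothetical. Under the Bradley-Terry model, the probability that the chosen response beats the rejected one is the sigmoid of their reward difference, and the reward model is fit by minimizing the negative log-likelihood of the observed preferences. PPO then maximizes a clipped surrogate of the probability ratio times the advantage.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bt_preference_prob(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry probability that the chosen response is preferred."""
    return math.exp(log_sigmoid(r_chosen - r_rejected))


def bt_loss(pairs) -> float:
    """Mean negative log-likelihood over (r_chosen, r_rejected) reward pairs."""
    return -sum(log_sigmoid(c - r) for c, r in pairs) / len(pairs)


def ppo_clipped_objective(ratio: float, advantage: float,
                          clip_eps: float = 0.2) -> float:
    """PPO clipped surrogate for one sample (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s); clip_eps = 0.2 is an assumed default.
    """
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


# Hypothetical reward scores for three comparisons: (chosen, rejected).
pairs = [(2.0, 0.5), (1.0, 1.0), (0.2, 1.5)]
print("Bradley-Terry loss:", bt_loss(pairs))
print("P(chosen > rejected) for first pair:", bt_preference_prob(2.0, 0.5))
print("Clipped objective, ratio=1.5, A=1.0:", ppo_clipped_objective(1.5, 1.0))
```

In a full RLHF pipeline the rewards would come from a learned network and the advantages from a value estimate. Both are replaced by plain floats here to keep the sketch self-contained.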