Chapter 12: Temporal Difference (TD) Learning

Learning objectives

- Implement TD(0) prediction: update \(V(s)\) using the TD target \(r + \gamma V(s')\) immediately after each transition.
- Compare TD(0) with Monte Carlo in terms of convergence speed and sample efficiency.
- Understand bootstrapping: TD uses current estimates instead of waiting for the episode to end.

Concept and real-world RL

Temporal Difference (TD) learning updates value estimates using the TD target \(r + \gamma V(s')\):

\(V(s) \leftarrow V(s) + \alpha [r + \gamma V(s') - V(s)]\)

Unlike Monte Carlo, TD does not need to wait for the episode to end; it bootstraps on the current estimate of \(V(s')\). TD(0) often converges faster per sample and works in continuing tasks. In practice, TD is the basis for SARSA, Q-learning, and many deep RL algorithms (e.g. DQN uses a TD-like target). Blackjack lets you compare TD(0) and MC on the same policy and state space. ...
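The TD(0) update above can be sketched in a few lines of tabular Python. The environment here is a hypothetical four-state deterministic chain (not from the post): states 0-3, each step moves right, and reaching the terminal state 3 yields reward 1.

```python
# TD(0) prediction on a small hypothetical chain MDP.
# States 0..3; state 3 is terminal. Each transition moves right,
# with reward 1.0 on entering the terminal state and 0.0 otherwise.

def td0_chain(alpha=0.1, gamma=1.0, episodes=200):
    V = [0.0] * 4  # tabular value estimates; V[3] stays 0.0 (terminal)
    for _ in range(episodes):
        s = 0
        while s != 3:
            s_next = s + 1
            r = 1.0 if s_next == 3 else 0.0
            # TD(0) update: bootstrap on the current estimate V[s_next]
            # instead of waiting for the episode return (contrast with MC)
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V
```

With \(\gamma = 1\) the true value of every non-terminal state is 1.0, and the estimates approach it after a few hundred episodes; note each update happens immediately after the transition, which is what distinguishes TD(0) from Monte Carlo.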

March 10, 2026 · 3 min · 589 words · codefrydev

Tabular Methods

This page covers the tabular methods you need for the preliminary assessment: policy iteration and value iteration, the difference between Monte Carlo and TD, on-policy vs off-policy learning, and the Q-learning update rule.

Why this matters for RL

When the state and action spaces are small enough, we can store one value per state (or state-action pair) and update them from experience or from the model. Dynamic programming does this when we know the model; Monte Carlo and TD do it from samples. Q-learning is the canonical off-policy TD method and is the basis of many deep RL algorithms (e.g. DQN). You need to know how these methods differ and how to write the Q-learning update. ...
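As a concrete illustration of the Q-learning update, here is a minimal sketch on a hypothetical five-state corridor (this environment is an assumption, not from the post): states 0-4, actions left/right, reward 1.0 for reaching state 4. The behaviour policy is epsilon-greedy while the update bootstraps on the greedy target, which is what makes Q-learning off-policy.

```python
import random

def q_learning_chain(alpha=0.5, gamma=0.9, epsilon=0.1, episodes=500, seed=0):
    # Hypothetical corridor: states 0..4, actions 0=left, 1=right.
    # Entering state 4 gives reward 1.0 and ends the episode.
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(5)]  # tabular Q(s, a)
    for _ in range(episodes):
        s = 0
        while s != 4:
            # Epsilon-greedy behaviour policy (exploration)
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == 4 else 0.0
            # Off-policy TD target: bootstrap on max over next actions,
            # regardless of which action the behaviour policy will take.
            best_next = 0.0 if s_next == 4 else max(Q[s_next])
            Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
            s = s_next
    return Q
```

After training, the greedy action in every state is "right", and Q(3, right) approaches 1.0 while Q(2, right) approaches \(\gamma \cdot 1.0 = 0.9\), matching the discounted optimal values.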

March 10, 2026 · 6 min · 1277 words · codefrydev