Chapter 11: Monte Carlo Methods

Learning objectives

- Implement first-visit Monte Carlo prediction: estimate \(V^\pi(s)\) by averaging returns from the first time \(s\) is visited in each episode.
- Use a Gym/Gymnasium blackjack environment and a fixed policy (stick on 20/21, else hit).
- Interpret value estimates for key states (e.g. usable ace, dealer showing 10).

Concept and real-world RL

Monte Carlo (MC) methods estimate value functions from experience: run episodes under a policy, compute the return from each state (or state-action pair), and average those returns. First-visit MC uses only the first time each state appears in an episode; every-visit MC uses every visit. No model (transition probabilities) is needed, only sample trajectories. In RL, MC is used when we can get full episodes (e.g. games, episodic tasks) and want simple, unbiased estimates. Game AI is a natural fit: blackjack has a small state space (player sum, dealer card, usable ace), stochastic transitions (card draws), and a clear “stick or hit” policy to evaluate. The same idea applies to evaluating a fixed strategy in any episodic game: run many episodes and average the returns observed from each state. ...
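As a minimal sketch of the idea (not the chapter's actual code): instead of Gymnasium's blackjack environment, a hand-rolled simplified blackjack stands in here — infinite deck of cards 2–10, no aces or usable-ace logic, dealer hits below 17 — evaluated under the fixed stick-on-20/21 policy. The function names and simplifications are assumptions for illustration.

```python
import random
from collections import defaultdict

def play_hand(rng):
    """Simplified blackjack: infinite deck of cards 2-10, no aces.
    Fixed policy: stick on 20 or 21, else hit. Dealer hits below 17.
    Returns (visited_states, outcome) with outcome in {+1, 0, -1}."""
    dealer_show = rng.randint(2, 10)
    player = rng.randint(2, 10) + rng.randint(2, 10)
    states = []
    while True:
        states.append((player, dealer_show))
        if player >= 20:          # fixed policy: stick on 20 or 21
            break
        player += rng.randint(2, 10)
        if player > 21:
            return states, -1.0   # player busts
    dealer = dealer_show + rng.randint(2, 10)
    while dealer < 17:
        dealer += rng.randint(2, 10)
    if dealer > 21 or player > dealer:
        return states, 1.0
    return states, -1.0 if player < dealer else 0.0

def first_visit_mc(num_episodes=50_000, seed=0):
    """Estimate V(s) by averaging the return from the first visit to s.
    With gamma = 1 and all intermediate rewards 0, G_t equals the outcome."""
    rng = random.Random(seed)
    total = defaultdict(float)
    count = defaultdict(int)
    for _ in range(num_episodes):
        states, outcome = play_hand(rng)
        for s in dict.fromkeys(states):   # first visits only, in order
            total[s] += outcome
            count[s] += 1
    return {s: total[s] / count[s] for s in total}

V = first_visit_mc()
# Sticking on 20 against a weak dealer card should have clearly positive value.
```

Even in this stripped-down game, the estimates show the expected pattern: states where the policy sticks on 20 are strongly positive, while low player sums against a dealer 10 are poor.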

March 10, 2026 · 4 min · 777 words · codefrydev

Monte Carlo in Code

Learning objectives

- Implement first-visit Monte Carlo policy evaluation in code (returns, averaging).
- Implement Monte Carlo control (estimate Q, improve the policy greedily).
- Implement MC control without exploring starts (e.g. an epsilon-greedy behavior policy).

Monte Carlo policy evaluation in code

Setup: you have an episodic environment (e.g. blackjack, gridworld) and a fixed policy \(\pi\). Goal: estimate \(V^\pi(s)\).

Algorithm:

- Run an episode: follow \(\pi\) and collect \((s_0, r_1, s_1, r_2, \ldots, r_T, s_T)\).
- For each step \(t\), compute the return from \(t\): \(G_t = r_{t+1} + \gamma r_{t+2} + \cdots + \gamma^{T-t-1} r_T\) (or loop backward from the end).
- First-visit: for each state \(s\) that appears in the episode, find the first time \(t\) with \(s_t = s\); add \(G_t\) to a list (or running sum) for \(s\) and increment the count for \(s\).
- After many episodes: \(V(s) = \) (sum of returns from first visits to \(s\)) / (count of first visits to \(s\)).

Code sketch: use a dict returns[s] = [] or a (total, count) pair. In each episode, track which states have been seen; on the first visit to \(s\) at step \(t\), append \(G_t\) (or add it to the total and increment the count). After all episodes, \(V(s)\) is the mean of the returns recorded for \(s\). ...
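The code sketch above might look like the following, assuming episodes are already recorded as lists of (state, reward) pairs; `evaluate_policy` and the episode format are illustrative choices, not the article's actual code.

```python
from collections import defaultdict

def evaluate_policy(episodes, gamma=1.0):
    """First-visit Monte Carlo policy evaluation.

    `episodes` is a list of trajectories, each a list of (state, reward)
    pairs where reward is the reward received after leaving that state
    (i.e. r_{t+1}). Returns the estimated V(s) for every state seen."""
    total = defaultdict(float)
    count = defaultdict(int)
    for episode in episodes:
        # Compute returns backward: G_t = r_{t+1} + gamma * G_{t+1}.
        G = 0.0
        returns = []
        for state, reward in reversed(episode):
            G = reward + gamma * G
            returns.append((state, G))
        returns.reverse()  # back to forward (time) order
        # First-visit: keep only the first occurrence of each state.
        seen = set()
        for state, G in returns:
            if state not in seen:
                seen.add(state)
                total[state] += G
                count[state] += 1
    return {s: total[s] / count[s] for s in total}

# Two hand-written episodes over states A and B, gamma = 1:
eps = [
    [("A", 0.0), ("B", 1.0)],   # G_A = 1, G_B = 1
    [("A", 0.0), ("B", 3.0)],   # G_A = 3, G_B = 3
]
V = evaluate_policy(eps)
# V["A"] = (1 + 3) / 2 = 2.0, and likewise V["B"] = 2.0
```

The backward loop avoids recomputing each suffix sum from scratch: \(G_t = r_{t+1} + \gamma G_{t+1}\) gives every return in one pass.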

March 10, 2026 · 3 min · 464 words · codefrydev

Tabular Methods

This page covers the tabular methods you need for the preliminary assessment: policy iteration and value iteration, the difference between Monte Carlo and TD, on-policy vs. off-policy learning, and the Q-learning update rule.

Why this matters for RL

When the state and action spaces are small enough, we can store one value per state (or state-action pair) and update those values from experience or from the model. Dynamic programming does this when the model is known; Monte Carlo and TD do it from samples. Q-learning is the canonical off-policy TD method and is the basis of many deep RL algorithms (e.g. DQN). You need to know how these methods differ and how to write the Q-learning update. ...
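As a hedged sketch of the Q-learning update on a toy problem — a 1-D chain where only the rightmost state is rewarding; the function name and hyperparameters here are illustrative assumptions:

```python
import random
from collections import defaultdict

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D chain: states 0..n_states-1, actions
    0 (left) and 1 (right); reward +1 for stepping onto the rightmost
    state (terminal), 0 otherwise. The behavior policy is epsilon-greedy,
    but the update is off-policy: it bootstraps from max over next-state
    actions regardless of the action actually taken next."""
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0, 0.0])
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection from Q (ties go right).
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Q-learning update:
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = r + gamma * max(Q[s_next])  # max is 0 at the terminal state
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
    return Q

Q = q_learning()
# After training, the greedy policy moves right in every non-terminal state.
```

The single line computing `target` and the increment below it are the whole Q-learning rule; everything else is environment and exploration scaffolding.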

March 10, 2026 · 6 min · 1277 words · codefrydev