TD

Dynamic programming, Monte Carlo vs TD, on-policy vs off-policy, and Q-learning, with explanations and examples.

5 quick questions after Chapters 11–15 of Volume 2. Check you're ready to continue.

TD(0) prediction for blackjack; compare with Monte Carlo.

15 short drill problems for Volume 2: Monte Carlo, TD(0), SARSA, Q-learning, and n-step methods.