Chapter 20: The Limits of Tabular Methods

Learning objectives
- Estimate memory for a tabular Q-table (states × actions × bytes per entry).
- Relate the scale of real problems (e.g. Backgammon, continuous state) to the infeasibility of tables.
- Argue why function approximation (linear, neural) is necessary for large or continuous spaces.

Concept and real-world RL
Tabular methods store one value per state (or state-action). When the state space is huge or continuous, this is impossible: Backgammon has on the order of \(10^{20}\) states; a robot with 10 continuous state variables discretized to 100 bins each has \(100^{10}\) cells. ...
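The memory estimate named in the objectives can be sketched in a few lines. This is a back-of-envelope helper, not from the chapter itself; the 8 bytes per entry assumes one float64 per Q-value, and the action counts are illustrative.

```python
def q_table_bytes(n_states: int, n_actions: int, bytes_per_entry: int = 8) -> int:
    """Memory for a dense Q-table: states x actions x bytes per entry."""
    return n_states * n_actions * bytes_per_entry

# Small gridworld: 10,000 states, 4 actions -> 320 KB, easily tabular.
small = q_table_bytes(10_000, 4)        # 320_000 bytes

# Backgammon-scale: ~1e20 states, ~20 moves -> ~1.6e22 bytes, far beyond any machine.
huge = q_table_bytes(10**20, 20)
```

Even generous discretization does not help: \(100^{10} = 10^{20}\) cells for the robot example lands in the same infeasible regime, which is the argument for function approximation.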

March 10, 2026 · 4 min · 645 words · codefrydev

Tabular Methods

This page covers the tabular methods you need for the preliminary assessment: policy iteration and value iteration, the difference between Monte Carlo and TD, on-policy vs off-policy learning, and the Q-learning update rule.

Why this matters for RL
When the state and action spaces are small enough, we can store one value per state (or state-action) and update them from experience or from the model. Dynamic programming does this when we know the model; Monte Carlo and TD do it from samples. Q-learning is the canonical off-policy TD method and is the basis of many deep RL algorithms (e.g. DQN). You need to know how these methods differ and how to write the Q-learning update. ...

March 10, 2026 · 6 min · 1277 words · codefrydev