Phase 7 — Deep RL

Volumes 3–5: value function approximation, DQN family, policy gradients, actor-critic, and advanced policy optimization (chapters 21–50).

Note: The Phase 7 milestones page uses the URL phase-4 on this site. Deep RL theory lives in Volumes 3–5; the demo page is a UI sample, not a substitute for the chapters.

Chapters 21–30: linear approximation, DQN, replay, target networks, and extensions.

Learning materials

Volume 3