Actor-Critic
Why function approximation (FA), the policy gradient update, DQN exploration, experience replay, and actor-critic, with explanations.
5 quick questions after Chapters 31–35 of Volume 4. Check you're ready to continue.
Sketch two-network actor-critic; pseudocode for TD error updates.
A3C with multiprocessing workers; compare speed with A2C.
Simplified Dreamer: RSSM, imagination phase, actor-critic.
Review Volume 4 (Policy Gradients, Actor-Critic, DDPG, TD3) and preview Volume 5 (PPO, TRPO, SAC — stable, scalable policy optimization).
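For the TD error update item listed above, a minimal tabular sketch may help fix the idea before attempting the two-network version. It assumes a softmax actor over per-state action preferences and a table-based critic. The names `softmax`, `actor_critic_step`, `theta`, `V`, and the step sizes are illustrative choices, not taken from the course material.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, V, s, a, r, s_next, done,
                      gamma=0.99, alpha_actor=0.1, alpha_critic=0.1):
    """Apply one one-step actor-critic update in place and return the TD error.

    theta: list of per-state action preference lists (the actor, hypothetical layout)
    V:     list of state-value estimates (the critic, hypothetical layout)
    """
    # TD target bootstraps from the critic unless the episode has ended.
    target = r if done else r + gamma * V[s_next]
    delta = target - V[s]  # TD error

    # Action probabilities under the current (pre-update) policy.
    probs = softmax(theta[s])

    # Critic: move V(s) toward the TD target.
    V[s] += alpha_critic * delta

    # Actor: for a softmax policy, d/d theta[s][b] of log pi(a|s)
    # equals 1[b == a] - pi(b|s); scale it by the TD error.
    for b in range(len(theta[s])):
        grad_log = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += alpha_actor * delta * grad_log

    return delta


# Usage: two states, two actions, a single observed transition.
theta = [[0.0, 0.0], [0.0, 0.0]]
V = [0.0, 0.0]
delta = actor_critic_step(theta, V, s=0, a=1, r=1.0, s_next=1, done=False)
```

A positive TD error raises the preference for the action that was taken and lowers the others; a negative one does the opposite. In a two-network version, the tables would be replaced by function approximators and the same TD error would scale both gradient steps.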