Chapter 38: Continuous Action Spaces

Learning objectives

- Design a policy network for continuous actions that outputs the mean and log-standard deviation of a Gaussian (or similar) distribution.
- Sample actions from the distribution and compute the log-probability \(\log \pi(a|s)\) for use in policy gradient updates.
- Apply this to an environment with continuous actions (e.g. Pendulum-v1).

Concept and real-world RL

For continuous actions (e.g. torque, throttle), we cannot use a softmax over a finite set. Instead we use a continuous distribution, often a Gaussian: \(\pi(a|s) = \mathcal{N}(a; \mu(s), \sigma(s)^2)\). The policy network outputs \(\mu(s)\) and \(\log \sigma(s)\) (the log-std, for numerical stability); we sample \(a = \mu + \sigma \cdot z\) with \(z \sim \mathcal{N}(0,1)\). The log-probability is \(\log \pi(a|s) = -\frac{1}{2}\left(\log(2\pi) + 2\log\sigma + \frac{(a-\mu)^2}{\sigma^2}\right)\). In robot control (e.g. Pendulum, MuJoCo), actions are continuous; the same Gaussian policy is used in REINFORCE, actor-critic, and PPO for continuous control. ...
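The reparameterized sampling step and the log-probability formula above can be sketched in NumPy (function name and array shapes are illustrative, not from the chapter; per-dimension log-probs are summed, treating action dimensions as independent):

```python
import numpy as np

def gaussian_policy_sample(mu, log_std, rng):
    """Sample a = mu + sigma * z with z ~ N(0, 1) and return (a, log pi(a|s))."""
    sigma = np.exp(log_std)  # log-std parameterization keeps sigma positive
    z = rng.standard_normal(mu.shape)
    a = mu + sigma * z
    # log pi(a|s) = -1/2 * (log(2*pi) + 2*log(sigma) + (a - mu)^2 / sigma^2), per dim
    log_prob = -0.5 * (np.log(2 * np.pi) + 2 * log_std + ((a - mu) / sigma) ** 2)
    return a, log_prob.sum()  # sum over independent action dimensions
```

In a policy gradient update, `log_prob` would be computed with autodiff (e.g. `torch.distributions.Normal(mu, sigma).log_prob(a)`) so gradients flow into the network producing `mu` and `log_std`.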

March 10, 2026 · 3 min · 533 words · codefrydev

Chapter 39: Deep Deterministic Policy Gradient (DDPG)

Learning objectives

- Implement DDPG: a deterministic policy (actor) plus a Q-function (critic), with target networks and a replay buffer.
- Use Ornstein-Uhlenbeck (OU) noise (or Gaussian noise) on the action for exploration in continuous spaces.
- Train on Pendulum-v1 (or similar) and plot the learning curve.

Concept and real-world RL

DDPG is an actor-critic method for continuous actions: the actor outputs a single action \(\mu(s)\) (no distribution), and the critic learns \(Q(s,a)\). The policy is updated to maximize \(Q(s, \mu(s))\), with the gradient flowing through Q into the actor. Target networks and a replay buffer stabilize learning (as in DQN). Exploration comes from adding noise to the action (e.g. OU noise for temporally correlated exploration, or simple Gaussian noise). In robot control (Pendulum, MuJoCo), DDPG is a baseline for continuous tasks; TD3 and SAC improve on it with clipped double Q-learning and stochastic policies, respectively. ...
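Two small pieces of DDPG can be sketched concretely: the OU exploration noise and the soft (Polyak) target-network update. This is a minimal NumPy sketch with commonly used hyperparameter defaults (theta, sigma, dt, tau are illustrative assumptions, not values fixed by the chapter):

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated noise added to actions."""
    def __init__(self, dim, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.x = np.zeros(dim)
        self.rng = np.random.default_rng(seed)

    def reset(self):
        """Reset the process state at the start of each episode."""
        self.x[:] = 0.0

    def sample(self):
        # dx = -theta * x * dt + sigma * sqrt(dt) * dW  (mean-reverting toward 0)
        dx = (-self.theta * self.x * self.dt
              + self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.x.shape))
        self.x = self.x + dx
        return self.x.copy()

def soft_update(online, target, tau=0.005):
    """Polyak averaging for target networks: target <- tau * online + (1 - tau) * target."""
    return tau * online + (1 - tau) * target
```

During rollout the executed action would be `np.clip(actor(s) + noise.sample(), low, high)`; after each gradient step, `soft_update` is applied to both the target actor and target critic parameters.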

March 10, 2026 · 3 min · 524 words · codefrydev