Chapter 51: Model-Free vs. Model-Based RL

Learning objectives

- Compare model-free (e.g. PPO) and model-based (e.g. Dreamer) RL in terms of sample efficiency on a continuous control task like Walker.
- Explain why model-based methods can achieve more reward per real environment step (via imagined rollouts).
- Identify the trade-offs: model bias, computation, and implementation complexity.

Concept and real-world RL

Model-free methods learn a policy or value function directly from experience; model-based methods learn a dynamics model and use it for planning or for imagined rollouts. Model-based RL can be more sample-efficient because each real transition can be reused many times through the model (short rollouts, planning). In robot navigation and trading, where real data is expensive, sample efficiency matters; in game AI, model-based methods (e.g. MuZero) combine learning and planning. The downside is model error (which compounds over long rollouts) and extra computation. ...
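The sample-efficiency argument can be made concrete with a toy sketch. All names and dynamics below are hypothetical, not from the chapter: a single real transition seeds many short imagined rollouts through a (slightly biased) learned model, so the agent obtains extra training data without extra environment steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # Toy 1-D environment dynamics, unknown to the agent.
    return 0.9 * s + a

def learned_model(s, a):
    # Slightly biased learned dynamics (illustrating model error).
    return 0.88 * s + a

# One real transition collected from the environment...
s0, a0 = 1.0, 0.1
s1 = true_step(s0, a0)

# ...reused to generate several short imagined rollouts.
imagined = []
for _ in range(10):
    s = s1
    traj = [s]
    for _ in range(5):  # short horizon limits compounding model error
        a = rng.normal(0.0, 0.1)  # exploratory imagined action
        s = learned_model(s, a)
        traj.append(s)
    imagined.append(traj)

print(len(imagined), len(imagined[0]))  # → 10 6
```

One real step has produced 50 imagined transitions; the short horizon is the usual hedge against the model bias mentioned above.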

March 10, 2026 · 3 min · 446 words · codefrydev

Chapter 56: MuZero Intuition

Learning objectives

- Read a MuZero paper summary and explain how MuZero learns a model in latent space without access to the true environment dynamics.
- Explain how MuZero handles reward prediction and value prediction in the latent space.
- Contrast with AlphaZero (which uses the true game rules).

Concept and real-world RL

MuZero learns a latent dynamics model: instead of predicting the raw next state, it predicts the next latent state and (optionally) the reward and value. The “model” is thus learned end-to-end for the purpose of planning; it does not need to reconstruct the true state. This lets MuZero work in video games and other domains where the rules are unknown. In game AI, MuZero achieves strong results on Atari and board games without hand-coded dynamics. ...
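The structure above can be sketched as MuZero's three functions. This is a minimal toy with random weights (all names and shapes are assumptions, and real MuZero trains these networks end-to-end): a representation h maps an observation to a latent state, a dynamics function g predicts the next latent and a reward (never a raw observation), and a prediction function f outputs policy logits and a value from a latent.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT, ACTIONS, OBS = 4, 3, 8

W_h = rng.normal(size=(LATENT, OBS))               # representation weights
W_g = rng.normal(size=(LATENT, LATENT + ACTIONS))  # latent dynamics weights
w_r = rng.normal(size=LATENT + ACTIONS)            # reward head
W_p = rng.normal(size=(ACTIONS, LATENT))           # policy head
w_v = rng.normal(size=LATENT)                      # value head

def h(obs):
    # Representation: observation -> latent state.
    return np.tanh(W_h @ obs)

def g(latent, action):
    # Dynamics: (latent, action) -> (next latent, predicted reward).
    x = np.concatenate([latent, np.eye(ACTIONS)[action]])
    return np.tanh(W_g @ x), float(w_r @ x)

def f(latent):
    # Prediction: latent -> (policy logits, value).
    return W_p @ latent, float(w_v @ latent)

# Unroll the learned model for planning: no true game rules needed,
# which is exactly where MuZero departs from AlphaZero.
latent = h(rng.normal(size=OBS))
for action in [0, 2, 1]:
    latent, reward = g(latent, action)
    logits, value = f(latent)

print(latent.shape)  # → (4,)
```

Note that nothing here decodes the latent back into an observation: the model is judged only on how well its reward, value, and policy predictions support planning.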

March 10, 2026 · 3 min · 468 words · codefrydev

Chapter 60: Visualizing Model-Based Rollouts

Learning objectives

- For a learned dynamics model (e.g. from Chapter 52), sample a starting state and generate a rollout of predicted states for a fixed action sequence.
- Plot the true states (from the environment) and the predicted states (from the model) on the same axes to visualize compounding error.
- Interpret the plot: where does the model diverge from reality?

Concept and real-world RL

Visualizing model rollouts against real rollouts makes compounding error concrete: small one-step errors accumulate and the predicted trajectory drifts. In robot navigation and model-based RL, this motivates short rollouts, ensemble methods, and uncertainty-aware planning. The same idea applies to trading models (predictions diverge over time) and to dialogue (conversation dynamics). ...
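The rollout comparison can be sketched with toy dynamics (all functions and constants here are assumptions for illustration, not the Chapter 52 model): roll the true environment and the learned model forward under the same fixed action sequence, with the model feeding on its own predictions, and watch the gap grow. Plotting `true_traj` and `pred_traj` on the same axes with matplotlib shows the drift directly.

```python
import numpy as np

def true_step(s, a):
    # Toy 1-D environment dynamics.
    return 0.95 * s + 0.1 * a

def model_step(s, a):
    # Learned model with a small one-step bias.
    return 0.97 * s + 0.1 * a

actions = np.ones(30)  # fixed action sequence
s_true = s_pred = 1.0
true_traj, pred_traj = [s_true], [s_pred]
for a in actions:
    s_true = true_step(s_true, a)
    s_pred = model_step(s_pred, a)  # model rolls out on its own predictions
    true_traj.append(s_true)
    pred_traj.append(s_pred)

err = np.abs(np.array(true_traj) - np.array(pred_traj))
print(err[1] < err[10] < err[30])  # → True: error compounds over the horizon
```

The one-step error here is tiny (0.02 after the first step), yet the open-loop rollout drifts steadily, which is why short horizons and uncertainty-aware planning help.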

March 10, 2026 · 3 min · 466 words · codefrydev