Chapter 60: Visualizing Model-Based Rollouts
Learning objectives
For a learned dynamics model (e.g. from Chapter 52), sample a starting state and generate a rollout of predicted states for a fixed action sequence.
Plot the true states (from the environment) and the predicted states (from the model) on the same axes to visualize compounding error.
Interpret the plot: where does the model diverge from reality?

Concept and real-world RL
Visualizing model rollouts vs real rollouts makes compounding error concrete: small 1-step errors accumulate and the predicted trajectory drifts. In robot navigation and model-based RL, this motivates short rollouts, ensemble methods, and uncertainty-aware planning. The same idea applies to trading models (predictions diverge over time) and dialogue (conversation dynamics). ...
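The learning objectives above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's own implementation: `true_step` is a hypothetical damped-pendulum environment, and `model_step` is a stand-in for a learned dynamics model, imitated here by using a slightly wrong gravity constant. The names `rollout`, `l2_errors`, and `plot_rollouts` are likewise illustrative. In practice you would replace `model_step` with your trained model's one-step prediction.

```python
import math


def true_step(state, action, dt=0.05):
    """Hypothetical environment: damped pendulum, state = (angle, angular velocity)."""
    theta, omega = state
    omega_next = omega + dt * (-9.81 * math.sin(theta) - 0.1 * omega + action)
    theta_next = theta + dt * omega_next
    return (theta_next, omega_next)


def model_step(state, action, dt=0.05):
    """Stand-in for a learned model (assumption): same form, slightly wrong gravity."""
    theta, omega = state
    omega_next = omega + dt * (-9.6 * math.sin(theta) - 0.1 * omega + action)
    theta_next = theta + dt * omega_next
    return (theta_next, omega_next)


def rollout(step_fn, start_state, actions):
    """Apply step_fn repeatedly, feeding each output back in as the next input."""
    states = [start_state]
    for action in actions:
        states.append(step_fn(states[-1], action))
    return states


def l2_errors(true_states, pred_states):
    """Euclidean distance between true and predicted state at each time step."""
    return [math.dist(t, p) for t, p in zip(true_states, pred_states)]


def plot_rollouts(true_states, pred_states):
    """Plot the first state dimension of both rollouts on the same axes."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    steps = range(len(true_states))
    fig, ax = plt.subplots()
    ax.plot(steps, [s[0] for s in true_states], label="true (environment)")
    ax.plot(steps, [s[0] for s in pred_states], "--", label="predicted (model)")
    ax.set_xlabel("time step")
    ax.set_ylabel("angle (rad)")
    ax.legend()
    return fig


# Same starting state and same fixed action sequence for both rollouts.
start = (1.0, 0.0)
actions = [0.0] * 100
true_states = rollout(true_step, start, actions)
pred_states = rollout(model_step, start, actions)
errors = l2_errors(true_states, pred_states)
```

Calling `plot_rollouts(true_states, pred_states)` followed by `matplotlib.pyplot.show()` draws both trajectories on one set of axes. Because the model rollout feeds its own predictions back in as inputs, the small 1-step error in `errors[1]` grows over the horizon, and plotting `errors` against the time step is one way to locate where the model starts to diverge.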