Chapter 63: Curiosity-Driven Exploration (ICM)

Learning objectives

Implement the Intrinsic Curiosity Module: a forward model that predicts next-state features from current state and action.
Use prediction error (between predicted and actual next features) as intrinsic reward and combine it with A2C.
Explain why prediction error encourages exploration in novel or stochastic parts of the state space.
Compare exploration behavior (e.g. coverage, time to goal) with and without ICM on a sparse-reward maze.
Relate curiosity-driven exploration to robot navigation and game AI where rewards are sparse.

Concept and real-world RL ...
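
To make the forward-model idea in the objectives concrete, here is a minimal sketch in plain Python. It is not the chapter's implementation. It assumes a linear forward model, raw state vectors used directly as features (no learned encoder and no inverse model), and hypothetical names throughout (`LinearForwardModel`, `eta`, `beta`, `extrinsic`). The intrinsic reward is half the squared prediction error scaled by `eta`, and the model is trained with plain SGD on that same error.

```python
import random


class LinearForwardModel:
    """Hypothetical linear forward model: (features, action) -> next features."""

    def __init__(self, feat_dim, n_actions, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.n_actions = n_actions
        self.lr = lr
        in_dim = feat_dim + n_actions
        # Small random weights, one row per predicted feature.
        self.W = [[rng.uniform(-0.01, 0.01) for _ in range(in_dim)]
                  for _ in range(feat_dim)]

    def _input(self, phi, action):
        # Concatenate the features with a one-hot encoding of the action.
        one_hot = [1.0 if a == action else 0.0 for a in range(self.n_actions)]
        return list(phi) + one_hot

    def predict(self, phi, action):
        x = self._input(phi, action)
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.W]

    def intrinsic_reward(self, phi, action, phi_next, eta=1.0):
        # Half the squared prediction error, scaled by eta (assumed value).
        pred = self.predict(phi, action)
        return 0.5 * eta * sum((p - t) ** 2 for p, t in zip(pred, phi_next))

    def update(self, phi, action, phi_next):
        # One SGD step on the squared prediction error.
        x = self._input(phi, action)
        pred = self.predict(phi, action)
        for i, row in enumerate(self.W):
            err = pred[i] - phi_next[i]
            for j in range(len(row)):
                row[j] -= self.lr * err * x[j]


model = LinearForwardModel(feat_dim=2, n_actions=4)

# Revisit the same transition many times: the model learns it,
# so the curiosity bonus for it shrinks.
phi, action, phi_next = [0.0, 1.0], 2, [1.0, 1.0]
rewards = []
for _ in range(20):
    rewards.append(model.intrinsic_reward(phi, action, phi_next))
    model.update(phi, action, phi_next)

# A transition the model has never trained on still has a large error.
novel = model.intrinsic_reward([1.0, 0.0], 0, [1.0, 0.0])

# Reward passed to the policy-gradient learner (e.g. A2C);
# beta and extrinsic are assumed values for illustration.
beta, extrinsic = 0.1, 0.0
total_reward = extrinsic + beta * novel
```

In this sketch the bonus for the repeated transition decays as the model fits it, while the unseen transition keeps a comparatively large bonus, which is the mechanism that pushes the agent toward unfamiliar states. A transition with random outcomes would never become predictable under this model, so its bonus would not decay in the same way.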

March 10, 2026 · 3 min · 624 words · codefrydev