Chapter 62: Intrinsic Motivation
Learning objectives

- Design an intrinsic reward based on state visitation counts: bonus = \(1/\sqrt{N(s)}\) (or similar), so rarely visited states are more attractive.
- Implement an agent that uses total reward = extrinsic + intrinsic, and compare its exploration behavior (e.g. coverage of the state space) with that of an agent using only the extrinsic reward.
- Relate intrinsic motivation to curiosity and exploration in game AI and robot navigation.

Concept and real-world RL

Intrinsic motivation gives the agent a bonus for visiting novel or surprising states, so it keeps exploring even when the extrinsic reward is sparse. A count-based bonus \(1/\sqrt{N(s)}\) (the inverse square root of the visit count) makes states that have been visited fewer times more attractive. In game AI and robot navigation, this can help the agent discover a distant goal; in recommendation systems, novelty bonuses encourage diversity. The combined reward, extrinsic + intrinsic, balances exploitation (extrinsic reward) and exploration (novelty).

...
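The comparison described in the learning objectives can be sketched with tabular Q-learning on a sparse-reward chain. This is a minimal illustration, not the chapter's implementation: the chain environment, the `run_agent` helper, and the hyperparameters are assumptions chosen for brevity. The intrinsic agent adds \(1/\sqrt{N(s)}\) to the extrinsic reward; coverage is measured as the number of distinct states visited.

```python
import random
from collections import defaultdict

def run_agent(use_intrinsic, episodes=300, n=20, seed=0):
    """Q-learning on an n-state chain with sparse extrinsic reward
    (reward 1 only at state n-1, start at state 0).
    If use_intrinsic, add a count-based bonus 1/sqrt(N(s')).
    Returns the set of states the agent visited (illustrative setup)."""
    rng = random.Random(seed)
    Q = defaultdict(float)      # Q[(state, action)], actions are -1 / +1
    counts = defaultdict(int)   # N(s): visit counts
    visited = set()
    alpha, gamma, eps = 0.5, 0.95, 0.1
    for _ in range(episodes):
        s = 0
        for _ in range(2 * n):  # cap episode length
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice((-1, 1))
            else:
                a = max((-1, 1), key=lambda act: Q[(s, act)])
            s2 = min(max(s + a, 0), n - 1)
            counts[s2] += 1
            visited.add(s2)
            r = 1.0 if s2 == n - 1 else 0.0       # sparse extrinsic reward
            if use_intrinsic:
                r += 1.0 / counts[s2] ** 0.5      # count-based novelty bonus
            # standard Q-learning update on the combined reward
            best_next = max(Q[(s2, -1)], Q[(s2, 1)])
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
            if s == n - 1:
                break
    return visited

coverage_extrinsic = len(run_agent(use_intrinsic=False))
coverage_intrinsic = len(run_agent(use_intrinsic=True))
print("extrinsic-only coverage:", coverage_extrinsic)
print("extrinsic+intrinsic coverage:", coverage_intrinsic)
```

With the extrinsic reward alone the agent rarely strays far from the start, because a zero-initialized greedy policy has no incentive to move; the novelty bonus inflates the value of rarely visited states, pulling the agent outward until it finds the goal.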