Chapter 67: Meta-Learning (Learning to Learn)
Learning objectives
Define a distribution of tasks (e.g. different goal positions in a gridworld) and sample tasks from it for meta-training.
Implement a meta-training loop: for each sampled task, collect data or run a few steps of adaptation, then update the meta-policy or meta-parameters to improve few-shot performance.
Explain the goal of meta-RL: to learn an initialization or algorithm that adapts quickly to new tasks with few gradient steps or few episodes.
Evaluate the meta-learned policy on held-out tasks with limited data and compare it with training from scratch.
Relate meta-RL to robot navigation (different goals or terrains) and game AI (different levels or opponents).
Concept and real-world RL ...
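The meta-training loop named in the learning objectives can be sketched on a toy problem. The example below is an illustration under stated assumptions, not this chapter's implementation: each "task" is a goal position on a 1-D line (a stand-in for a gridworld goal), the "policy" is a single parameter theta, the task loss is squared distance to the goal, and the meta-update is a Reptile-style first-order rule that moves the initialization toward the adapted parameters. All function names, the goal range, and the hyperparameters are hypothetical.

```python
import random


def sample_task(rng, low=4.0, high=6.0):
    """Sample one task: a goal position drawn from an assumed uniform range."""
    return rng.uniform(low, high)


def loss(theta, goal):
    """Squared distance between the parameter and the task's goal."""
    return (theta - goal) ** 2


def adapt(theta, goal, steps=3, inner_lr=0.1):
    """Inner loop: a few gradient steps on one task, starting from theta."""
    for _ in range(steps):
        grad = 2.0 * (theta - goal)  # derivative of (theta - goal)^2
        theta -= inner_lr * grad
    return theta


def meta_train(rng, iterations=2000, meta_lr=0.05, theta0=0.0):
    """Outer loop (Reptile-style): nudge the initialization toward
    the parameters obtained after adapting to each sampled task."""
    theta = theta0
    for _ in range(iterations):
        goal = sample_task(rng)
        adapted = adapt(theta, goal)
        theta += meta_lr * (adapted - theta)
    return theta


def evaluate(theta_init, goals):
    """Mean loss after the same small adaptation budget on each held-out task."""
    return sum(loss(adapt(theta_init, g), g) for g in goals) / len(goals)


rng = random.Random(0)
meta_theta = meta_train(rng)

# Held-out tasks: fresh goals not used during meta-training.
held_out = [sample_task(rng) for _ in range(100)]
meta_loss = evaluate(meta_theta, held_out)
scratch_loss = evaluate(0.0, held_out)  # same budget, uninformed start

print(f"meta-learned init: {meta_theta:.3f}")
print(f"post-adaptation loss (meta init): {meta_loss:.4f}")
print(f"post-adaptation loss (scratch):   {scratch_loss:.4f}")
```

With only three gradient steps allowed per task, the meta-learned initialization starts near the centre of the goal distribution, so it should end closer to each held-out goal than a start from zero does. In a real meta-RL setting the inner loop would collect episodes and apply a policy-gradient update rather than differentiate a known loss.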