Chapter 66: Go-Explore Algorithm
Learning objectives
Implement a simplified Go-Explore: an archive of promising states and a strategy to return to them and explore further.
Explain the two-phase idea: (1) archive states that lead to high rewards or novelty, (2) select from the archive, return to that state, then take exploratory actions.
Compare Go-Explore with random exploration (e.g. episodes to reach goal, or maximum reward reached) on a deterministic maze.
Identify why “return” (resetting to an archived state) helps in hard exploration compared to always starting from the initial state.
Relate Go-Explore to game AI (e.g. Montezuma’s Revenge) and robot navigation with sparse goals.
Concept and real-world RL ...
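As a rough illustration of the two-phase idea on a deterministic maze, the sketch below keeps an archive of visited cells, picks one (favoring rarely selected cells), returns to it by replaying its stored actions, and then takes a few random exploratory steps. Everything here is an assumption made for illustration rather than a reference implementation: the 5x5 `MAZE` layout, the helper names (`step`, `replay`, `go_explore`, `random_baseline`), the 1/sqrt(count + 1) selection weight, and the parameter defaults.

```python
import random

# Hypothetical maze for illustration: '#' wall, 'S' start, 'G' goal.
MAZE = ["S....",
        "####.",
        ".....",
        ".####",
        "....G"]
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(ch):
    """Locate a marker character in the maze."""
    for r, row in enumerate(MAZE):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(ch)


START, GOAL = find("S"), find("G")


def step(state, action):
    """Deterministic transition: move unless blocked by a wall or the border."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        return (r, c)
    return state


def replay(actions):
    """'Return' step: replay a stored action sequence from the start state."""
    state = START
    for a in actions:
        state = step(state, a)
    return state


def go_explore(iterations=20000, explore_steps=5, seed=0):
    rng = random.Random(seed)
    # Archive maps cell -> [shortest known action sequence, times selected].
    archive = {START: [[], 0]}
    for it in range(1, iterations + 1):
        cells = list(archive)
        # Assumed heuristic: prefer cells that were selected less often.
        weights = [1.0 / (archive[c][1] + 1) ** 0.5 for c in cells]
        cell = rng.choices(cells, weights=weights, k=1)[0]
        archive[cell][1] += 1
        traj = list(archive[cell][0])
        state = replay(traj)              # return to the archived cell
        for _ in range(explore_steps):    # explore from there
            a = rng.randrange(4)
            state = step(state, a)
            traj.append(a)
            if state not in archive or len(traj) < len(archive[state][0]):
                count = archive[state][1] if state in archive else 0
                archive[state] = [list(traj), count]
            if state == GOAL:
                return archive[GOAL][0], it
    return None, iterations


def random_baseline(episodes=200, max_steps=20, seed=0):
    """Baseline: every episode restarts from START and acts randomly."""
    rng = random.Random(seed)
    for ep in range(1, episodes + 1):
        state = START
        for _ in range(max_steps):
            state = step(state, rng.randrange(4))
            if state == GOAL:
                return ep
    return None
```

Calling `go_explore()` returns an action sequence and the iteration at which the goal was first reached; `replay(actions)` can be used to check that the sequence really ends at `GOAL`. Comparing that iteration count with the result of `random_baseline()` gives one simple way to see why returning to archived cells helps when the goal is far from the start.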