ICM

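ICM (Intrinsic Curiosity Module): a forward model predicts the next state, and its prediction error serves as the intrinsic reward; the agent is trained with A2C on a maze.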
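To make the forward-model idea concrete, here is a minimal sketch in NumPy. It is not the lesson's implementation. It assumes one-hot state features for a hypothetical 16-cell maze, a linear forward model, and a scale factor `eta`, and it leaves out ICM's learned feature encoder, the inverse model, and the A2C policy. All names (`ForwardModel`, `one_hot`, `eta`, `lr`) are illustrative.

```python
import numpy as np


class ForwardModel:
    """Linear forward model: predicts next-state features from (features, action)."""

    def __init__(self, n_features, n_actions, lr=0.25, eta=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.n_actions = n_actions
        self.lr = lr    # assumed step size for the forward-model update
        self.eta = eta  # assumed scale of the intrinsic reward
        self.W = rng.normal(scale=0.01, size=(n_features + n_actions, n_features))

    def _input(self, phi_s, action):
        # Concatenate state features with a one-hot encoding of the action.
        a = np.zeros(self.n_actions)
        a[action] = 1.0
        return np.concatenate([phi_s, a])

    def intrinsic_reward(self, phi_s, action, phi_next):
        # Intrinsic reward = scaled squared prediction error of the forward model.
        pred = self._input(phi_s, action) @ self.W
        return 0.5 * self.eta * float(np.sum((pred - phi_next) ** 2))

    def update(self, phi_s, action, phi_next):
        # One gradient step on 0.5 * ||x W - phi_next||^2.
        x = self._input(phi_s, action)
        err = x @ self.W - phi_next
        self.W -= self.lr * np.outer(x, err)


def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v


n_states, n_actions = 16, 4  # hypothetical 4x4 maze with 4 moves
model = ForwardModel(n_states, n_actions)

# Revisit the same transition many times: the reward should decay.
s, a, s_next = one_hot(0, n_states), 1, one_hot(1, n_states)
rewards = []
for _ in range(20):
    rewards.append(model.intrinsic_reward(s, a, s_next))
    model.update(s, a, s_next)

# A transition the model has never seen still yields a large reward.
novel = model.intrinsic_reward(one_hot(5, n_states), 2, one_hot(6, n_states))
print(f"first visit: {rewards[0]:.3f}, 20th visit: {rewards[-1]:.2e}, novel: {novel:.3f}")
```

In an A2C setup, this intrinsic reward would typically be added to the environment reward before computing returns and advantages, so the agent is drawn toward transitions the forward model cannot yet predict.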

Review Volume 7 (Exploration, ICM, RND, Go-Explore, Meta-RL) and preview Volume 8 (Offline RL, Imitation Learning, RLHF).