Intrinsic Reward

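ICM (Intrinsic Curiosity Module): a forward model predicts the next state's features from the current state and action; its prediction error is used as the intrinsic reward. The agent is trained with A2C on a maze.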
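
A minimal sketch of the forward-model part of ICM, assuming a small maze whose discrete states are encoded one-hot, so that a linear forward model reduces to a lookup table. The names `ForwardModel`, `eta`, and `lr` are hypothetical and chosen for illustration; the learned feature encoder and inverse model of the full ICM, and the A2C agent itself, are left out.

```python
class ForwardModel:
    """Tabular forward model (assumption: one-hot states, discrete actions)."""

    def __init__(self, n_states, n_actions, lr=0.5):
        self.n_states = n_states
        self.lr = lr
        # table[s][a] is the predicted one-hot encoding of the next state.
        self.table = [
            [[0.0] * n_states for _ in range(n_actions)]
            for _ in range(n_states)
        ]

    def intrinsic_reward(self, s, a, s_next, eta=1.0):
        # Reward is (eta / 2) * squared error between the prediction
        # and the one-hot encoding of the state actually reached.
        pred = self.table[s][a]
        err = sum(
            (p - (1.0 if j == s_next else 0.0)) ** 2
            for j, p in enumerate(pred)
        )
        return 0.5 * eta * err

    def update(self, s, a, s_next):
        # One gradient step on the squared prediction error.
        pred = self.table[s][a]
        for j in range(self.n_states):
            target = 1.0 if j == s_next else 0.0
            pred[j] += self.lr * (target - pred[j])


fm = ForwardModel(n_states=4, n_actions=2)
before = fm.intrinsic_reward(0, 1, 2)   # unseen transition: large error
fm.update(0, 1, 2)
after = fm.intrinsic_reward(0, 1, 2)    # same transition after one update
print(before, after)
```

The reward for a transition shrinks as the forward model learns it, so unvisited parts of the maze stay more rewarding. In an A2C loop one would presumably add a scaled copy of this value to the environment reward before computing advantages; the scale is a tuning choice not specified here.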

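RND (Random Network Distillation): a predictor network is trained to match the output of a fixed, randomly initialized target network; its prediction error is used as the intrinsic reward.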
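
A minimal sketch of the RND idea, assuming single linear layers in place of the neural networks and plain SGD in place of a real optimizer. The names `RND`, `n_out`, `lr`, and `seed` are hypothetical and chosen for illustration only.

```python
import random


def make_weights(rng, n_in, n_out):
    # Random weight matrix with n_out rows and n_in columns.
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    # Matrix-vector product: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class RND:
    def __init__(self, n_in, n_out=4, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.target = make_weights(rng, n_in, n_out)     # fixed, never trained
        self.predictor = make_weights(rng, n_in, n_out)  # trained to match target
        self.lr = lr

    def intrinsic_reward(self, obs):
        # Squared error between predictor and target outputs.
        t = linear(self.target, obs)
        p = linear(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, obs):
        # One SGD step on half the squared error; only the predictor changes.
        t = linear(self.target, obs)
        p = linear(self.predictor, obs)
        for k, row in enumerate(self.predictor):
            err = p[k] - t[k]
            for i in range(len(row)):
                row[i] -= self.lr * err * obs[i]


rnd = RND(n_in=3)
seen, unseen = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
for _ in range(50):
    rnd.update(seen)
print(rnd.intrinsic_reward(seen), rnd.intrinsic_reward(unseen))
```

Observations the predictor has been trained on yield a small error, while unfamiliar ones keep a large error, which is what makes the error usable as a novelty bonus.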