Rewards

How to design reward signals for MDPs and gridworlds: shaping, terminal rewards, and step penalties.
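
As a rough illustration of how these three pieces can fit together, here is a minimal sketch of a gridworld reward function. Everything named in it is an assumption made for the example rather than something specified above: the goal cell `GOAL`, the constants `STEP_PENALTY`, `GOAL_REWARD`, and `GAMMA`, and the choice of negative Manhattan distance as the potential. The shaping term uses the potential-based form gamma * phi(s') - phi(s), which is one common way to add shaping without changing which policies are optimal.

```python
from typing import Tuple

State = Tuple[int, int]  # (row, col) cell in the grid

# All values below are hypothetical, chosen only for illustration.
GOAL: State = (3, 3)    # goal cell; reaching it ends the episode
STEP_PENALTY = -0.04    # small cost for every non-terminal step
GOAL_REWARD = 1.0       # terminal reward for reaching the goal
GAMMA = 0.9             # discount factor used in the shaping term


def potential(state: State) -> float:
    """Negative Manhattan distance to the goal (zero at the goal)."""
    return -float(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))


def base_reward(next_state: State) -> float:
    """Terminal reward on reaching the goal, step penalty otherwise."""
    return GOAL_REWARD if next_state == GOAL else STEP_PENALTY


def shaped_reward(state: State, next_state: State) -> float:
    """Base reward plus the shaping term gamma * phi(s') - phi(s)."""
    shaping = GAMMA * potential(next_state) - potential(state)
    return base_reward(next_state) + shaping
```

With these assumed values, a step toward the goal earns a larger shaped reward than a step away from it, while the step penalty still discourages wandering and the terminal reward still marks success.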