Gridworld
Discounted return from a sequence of actions in gridworld.
The classic gridworld environment: states, actions, transitions, and terminal states.
Iterative policy evaluation on 4×4 gridworld.
Value iteration on 4×4 gridworld, optimal V and policy.
Code walkthrough for gridworld, iterative policy evaluation, and policy iteration.
Dyna-Q on 4×4 deterministic gridworld.
BFS planner for gridworld; compare with DP.
State visitation count bonus; exploration in gridworld.
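
As a concrete reference point for the topics above, here is a minimal sketch of iterative policy evaluation on a 4×4 gridworld. The setup is an assumption, not something stated in this list: terminal states in the top-left and bottom-right corners, a reward of -1 on every step, no discounting, and an equiprobable random policy. The names `step` and `policy_evaluation` are illustrative only.

```python
# Sketch: iterative policy evaluation on a 4x4 gridworld.
# Assumed setup (not from the source): two terminal corners, reward -1 per
# step, gamma = 1, and a uniformly random policy over four moves.
N = 4
TERMINALS = {(0, 0), (N - 1, N - 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic transition; a move off the grid leaves the state unchanged."""
    r, c = state
    dr, dc = action
    nr, nc = r + dr, c + dc
    if not (0 <= nr < N and 0 <= nc < N):
        nr, nc = r, c
    return (nr, nc), -1.0


def policy_evaluation(gamma=1.0, theta=1e-6):
    """Sweep all states in place until the largest update falls below theta."""
    V = {(r, c): 0.0 for r in range(N) for c in range(N)}
    while True:
        delta = 0.0
        for s in V:
            if s in TERMINALS:
                continue  # terminal values stay at 0
            new_v = 0.0
            for a in ACTIONS:
                s2, reward = step(s, a)
                new_v += 0.25 * (reward + gamma * V[s2])  # random policy
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < theta:
            return V


V = policy_evaluation()
for r in range(N):
    print(" ".join(f"{V[(r, c)]:7.2f}" for c in range(N)))
```

Under these assumptions the values should settle near -14 next to a terminal corner and near -22 in the two non-terminal corners. Changing the reward, the discount, or the terminal set changes the numbers but not the structure of the sweep.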