Chapter 81: Multi-Agent Fundamentals

Learning objectives

- Model a two-player zero-sum game (e.g. Rock-Paper-Scissors) as a Dec-POMDP (Decentralized Partially Observable MDP) or an equivalent multi-agent framework.
- Define states, observations, actions, and rewards for each agent in the game.
- Explain the difference between centralized (one controller sees everything) and decentralized (each agent has its own observation and policy) formulations.
- Identify how the same game can be viewed as a normal-form game (payoff matrix) and as a sequential Dec-POMDP (if we add structure).
- Relate multi-agent modeling to game AI (opponents, teammates) and trading (multiple market participants).

Concept and real-world RL ...
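As a concrete illustration of these objectives, here is a minimal sketch (with assumed names like `PAYOFF` and `step`) of Rock-Paper-Scissors as a two-agent normal-form game: each agent picks an action, and the per-agent rewards are zero-sum, which is the payoff structure a Dec-POMDP formulation would build on.

```python
# Rock-Paper-Scissors as a two-agent zero-sum normal-form game (sketch).

ACTIONS = ["rock", "paper", "scissors"]

# Payoff for agent 0; agent 1 receives the negation (zero-sum).
# PAYOFF[a0][a1] = +1 win, 0 tie, -1 loss for agent 0.
PAYOFF = {
    "rock":     {"rock": 0,  "paper": -1, "scissors": +1},
    "paper":    {"rock": +1, "paper": 0,  "scissors": -1},
    "scissors": {"rock": -1, "paper": +1, "scissors": 0},
}

def step(a0: str, a1: str) -> tuple[int, int]:
    """Joint action -> per-agent rewards. In the decentralized view,
    each agent observes only its own action and reward."""
    r0 = PAYOFF[a0][a1]
    return r0, -r0  # zero-sum: the two rewards always cancel

# Sanity check: every joint action's rewards sum to zero.
assert all(sum(step(a, b)) == 0 for a in ACTIONS for b in ACTIONS)

print(step("rock", "scissors"))  # rock beats scissors: (1, -1)
```

A centralized controller would see the joint action `(a0, a1)`; a decentralized agent sees only its own side of `step`, which is what makes the multi-agent learning problem hard.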

March 10, 2026 · 4 min · 673 words · codefrydev

Chapter 82: Game Theory Basics for RL

Learning objectives

- Compute the Nash equilibrium of a simple 2×2 game (e.g. Prisoner’s Dilemma) from the payoff matrix.
- Explain why independent learning (each agent learns its best response without knowing the other’s policy) might converge to an outcome that is not a Nash equilibrium, or might not converge at all.
- Compare Nash equilibrium payoffs with the payoffs that result from independent Q-learning or gradient-based learning in the same game.
- Identify the difference between cooperative, competitive, and mixed settings in terms of payoff structure.
- Relate game theory to game AI (opponent modeling) and trading (market equilibrium).

Concept and real-world RL ...
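The first objective can be sketched directly: a minimal brute-force search (with assumed standard Prisoner's Dilemma payoff values) that finds the pure-strategy Nash equilibria of a 2×2 game by checking whether either player gains from a unilateral deviation.

```python
# Pure-strategy Nash equilibria of the Prisoner's Dilemma by best-response checks.
from itertools import product

ACTIONS = ["cooperate", "defect"]

# (row payoff, column payoff); common textbook Prisoner's Dilemma numbers.
PAYOFF = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"):    (-3,  0),
    ("defect",    "cooperate"): ( 0, -3),
    ("defect",    "defect"):    (-2, -2),
}

def is_nash(a_row: str, a_col: str) -> bool:
    """A joint action is a Nash equilibrium iff neither player can improve
    its own payoff by deviating while the other player's action is fixed."""
    r, c = PAYOFF[(a_row, a_col)]
    row_ok = all(PAYOFF[(d, a_col)][0] <= r for d in ACTIONS)
    col_ok = all(PAYOFF[(a_row, d)][1] <= c for d in ACTIONS)
    return row_ok and col_ok

equilibria = [ac for ac in product(ACTIONS, ACTIONS) if is_nash(*ac)]
print(equilibria)  # [('defect', 'defect')]: mutual defection,
                   # even though (cooperate, cooperate) pays both players more
```

This also previews the second objective: independent best-response learners are pulled toward `(defect, defect)`, illustrating how self-interested learning can settle on an outcome worse for everyone than the cooperative one.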

March 10, 2026 · 4 min · 672 words · codefrydev